Blog Archive

The Java programming language – a journey from .java to .class to 0s and 1s

What is the Java programming language?


The Java programming language is a general-purpose, platform-independent programming language and a key component of the Java platform. The Java platform also includes other software components, such as the Java compiler, Java Runtime Environment (JRE), development tools, and other integration libraries.

Platform independence is perhaps the most important feature of the Java language. In the next couple of sections, we will learn how platform independence is achieved in the Java platform and what the key building blocks of the Java ecosystem are. But before that, let’s compare Java with C++.

What makes Java programming language platform-independent and how is it different from C++?


Let’s first understand how a C++ program works. On the Windows platform, when you write a C++ program (myprogram.cpp) and compile it, the compiler converts the program directly into an .exe file.

You can use this .exe file only on the Windows platform; this makes C++ a platform-dependent programming language.

On the other hand, Java program execution is essentially a two-step process. In the first step, the Java source file (myprogram.java) is compiled into bytecode (myprogram.class) by the Java compiler (javac).

The Java compiler is part of the Java Development Kit (JDK). This bytecode is then executed by the Java Virtual Machine (JVM): the JVM takes the .class file as input and generates the equivalent machine code that your CPU can understand. The JVM is part of the Java Runtime Environment (JRE).

Note: Technically, the JVM includes another component called the Just-In-Time (JIT) compiler, which helps improve the performance of a Java application. However, for the sake of simplicity, we will not distinguish between the JVM and the JIT.

Understanding the concepts of JDK, JRE, and JVM


Software developers often talk about JDK, JRE, and JVM while discussing or troubleshooting a Java program. Let’s learn more about these components.
  • Java Development Kit (JDK): It is primarily for software developers. The JDK includes the Java compiler (javac). The JDK is platform-specific. For instance, jdk-8u231-solaris-sparcv9.tar.gz is the JDK for the Solaris operating system running on processors based on the Scalable Processor Architecture (SPARC). The JDK also includes several other important files, such as:
    • src.zip: The Java source code.
    • jar.exe: A tool to build Java ARchive (JAR) files.
    • javadoc.exe: A command-line tool for Javadoc; Javadoc is used to generate API documentation from Java source files.
    • keytool.exe: A tool to generate encryption keys and certificates and store them in the Java keystore, an encrypted repository for keys and certificates.
  • Java Runtime Environment (JRE): Includes the JVM and other Java files. For example, if you search for the JVM in the JRE folder on a Windows machine, you will find the jvm.dll file, which is the Windows implementation of the JVM.

Understanding the workflow

  • Using Notepad, the Eclipse IDE, or any other text editor, create a Java file on your Windows machine. Assuming the JDK is already installed on this machine, compile the .java file using the Java compiler (javac). After the command executes successfully, a .class file (bytecode) is generated.
  • You can now use this bytecode on any other machine that has a JVM on it — essentially, the machine must have the JRE installed, as the JVM is a component of the JRE. The JVM then generates machine code appropriate to that platform, which the processor can execute. The JVM is platform dependent.
    Note: The JRE, JVM, and JDK are platform-dependent because the configuration of each operating system is different; however, the Java language itself is platform-independent.
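
To make this workflow concrete, here is a minimal program; the file and class names are just for illustration:

    // HelloWorld.java
    public class HelloWorld {
        public static void main(String[] args) {
            System.out.println("Hello, bytecode!");
        }
    }

Running javac HelloWorld.java produces HelloWorld.class (the bytecode), and java HelloWorld then executes that same .class file on any machine with a JRE, whether it runs Windows, Linux, or Solaris.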

Java and Data Security


In today’s digital economy, every business is a software business, and protecting your company’s intellectual capital is a top priority for every software executive.

The focus of IT expenditure is on preventing cyber-attacks proactively. This is all the more relevant with the growing adoption of cloud-based applications, as-a-service models (for instance, Infrastructure as a Service and Data as a Service), and API-driven businesses.

In this section, let’s understand how Java provides the necessary security for your code and applications. Before that, here’s a quick summary of the key building blocks of secure communication between a client and a server.
  • Security protocols (SSL and TLS, for instance): Cryptographic protocols that provide secure communication between two systems. When you connect to an HTTPS-secured server, your web browser checks the website’s certificate and verifies that it was issued by a certificate authority (CA).
    Note that the Transport Layer Security (TLS) protocol is now widely used instead of SSL, because security vulnerabilities were detected in SSL.
  • Certificate (SSL certificate, for example): It certifies that the certificate requester (a person or an organization) meets the stringent requirements to receive the certificate. A certificate is issued by a certification authority (CA), for example, VeriSign or DigiCert.
    Certificates are installed on the web server hosting your website and ensure secure communication between your website and your customer’s machine. Every web browser also comes with some certificates pre-installed.
    For example, when you type chrome://settings/?search=certificate in the Chrome web browser, you can see a list of all pre-installed certificates from the certification authorities.
  • Signature: It verifies that a particular digital document is authentic. In other words, it gives the receiver a guarantee that the message was generated by the intended sender. Every certificate is signed by its issuing CA.



  • Cipher Suite: A set of methods (algorithms) required to secure a network connection through SSL or TLS. These algorithms are implemented by a software library, such as OpenSSL. Here's an example of a typical cipher suite, broken down in the list below (a short sketch follows the list):
    TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384
    • TLS indicates the protocol.
    • ECDHE indicates the key exchange algorithm.
    • ECDSA indicates the authentication algorithm.
    • AES_256_CBC indicates the bulk encryption algorithm.
    • SHA384 indicates the MAC algorithm.
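
If you are curious which cipher suites your own JRE can negotiate, here is a minimal sketch using the standard JSSE API (the class name is arbitrary):

    import javax.net.ssl.SSLServerSocketFactory;

    public class ListCipherSuites {
        public static void main(String[] args) {
            SSLServerSocketFactory factory =
                (SSLServerSocketFactory) SSLServerSocketFactory.getDefault();
            // Prints every supported cipher suite, e.g.
            // TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384.
            for (String suite : factory.getSupportedCipherSuites()) {
                System.out.println(suite);
            }
        }
    }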
Java APIs address a wide range of security areas, such as cryptography, secure communication (SSL and TLS support), key and certificate management using the keytool utility, and security policy.

The C:\Program Files\Java\jre1.8.0_202\lib\security folder lists other important files. For example, java.security is the master security properties file used to manage various aspects of Java security.
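
As a small illustration of these cryptography APIs, the following sketch signs a message and verifies the signature with the standard java.security.Signature class; a throwaway key pair is generated just for the demo:

    import java.nio.charset.StandardCharsets;
    import java.security.KeyPair;
    import java.security.KeyPairGenerator;
    import java.security.Signature;

    public class SignatureDemo {
        public static void main(String[] args) throws Exception {
            // Throwaway EC key pair; in practice, keys live in a keystore
            // managed with the keytool utility.
            KeyPair pair = KeyPairGenerator.getInstance("EC").generateKeyPair();
            byte[] message = "important document".getBytes(StandardCharsets.UTF_8);

            // Sign with the private key...
            Signature signer = Signature.getInstance("SHA256withECDSA");
            signer.initSign(pair.getPrivate());
            signer.update(message);
            byte[] signature = signer.sign();

            // ...and verify with the public key.
            Signature verifier = Signature.getInstance("SHA256withECDSA");
            verifier.initVerify(pair.getPublic());
            verifier.update(message);
            System.out.println("Signature valid: " + verifier.verify(signature));
        }
    }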


Use your storage credit card wisely!

Here’s an attempt to simplify the concept of Thin Provisioning in enterprise storage management. It’s a nice idea, but it has some caveats.

So, let’s take an example of your email service provider (Gmail, Yahoo, Outlook).

Every email service provider wants to increase its customer base, but the actual physical storage is never added in the same ratio. For example, when you create your email account, the first email you get is something like – You have 15 GB of free storage that's shared across Drive, Email, and Photos…

But the email service provider does not keep adding 15 GB of physical storage with each new user. It simply assigns a logical storage space of 15 GB at that point in time. 

From a file-system perspective, physical capacity is consumed only when a “write” operation is performed. In simpler terms, physical storage is used up only when you actually save your data. Until then, that space can be used by other applications.

So, thin provisioning is essentially storage on demand: you get more physical space as you save more data, until you reach your hard limit, which is 15 GB in our example.

The opposite of Thin Provisioning is Thick (fat) Provisioning, where the storage admin allocates dedicated storage space up front, no matter how much space the customer actually uses. Other applications and users are not allowed to encroach on your 15 GB.

Essentially, it is the allocation of the entire space prior to use. Use thick provisioning when you need lower latency and less supervision. Not a perfect analogy, but you can think of it as a static versus a dynamic IP address. A toy sketch contrasting the two models follows.
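
Here is a toy Java sketch of the difference, with made-up numbers; it models only the accounting, not a real file system:

    public class ProvisioningDemo {
        static final long GB = 1L << 30;
        static final long QUOTA = 15 * GB;   // the 15 GB logical quota

        public static void main(String[] args) {
            // Thin provisioning: nothing is allocated up front;
            // physical space is consumed only when data is written.
            long thinUsed = 0;
            long write = 2 * GB;             // the user saves 2 GB of data
            if (thinUsed + write <= QUOTA) {
                thinUsed += write;
            }
            System.out.println("Thin: physical GB consumed = " + thinUsed / GB);

            // Thick provisioning: all 15 GB are reserved at account creation,
            // whether or not the user ever writes a byte.
            long thickReserved = QUOTA;
            System.out.println("Thick: physical GB reserved = " + thickReserved / GB);
        }
    }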

Coming back to Thin Provisioning: is it like using a credit card, utilizing more than what you have at your disposal?

Well, it’s about applying the technology in the right way. Understanding your “data growth projection” and constantly monitoring the underlying storage resources are critical parameters of your Thin Provisioning strategy. You need to fine-tune your data usage threshold levels so that resource-exhaustion alerting is accurate and timely.

What if all your applications (or VMs) need storage at the same time? Extremely unlikely, but possible. Resource exhaustion is not uncommon, and it is up to the storage administrator’s discretion to keep an optimum balance of efficient storage utilization and application availability.

A proper balance between risk and reward, and communication between the sysadmin who consumes storage and the storage admin who provides it, are critical.

But yes, we need Thin Provisioning. Imagine YouTube not using Thin Provisioning! How many terabytes of data are created on the Internet every hour?

So, from a storage optimization standpoint, Thin Provisioning is a big yes.

TCP/IP and Storage Networking

1 - iSCSI v/s FCoE:

  • iSCSI: An IP-based storage networking standard used to carry SCSI commands and transfer data between a SCSI initiator and a SCSI target.
  • FCoE: Encapsulates FC frames in Ethernet and transports them over an Ethernet network.
The two differ on several parameters:
  • Underlying network layers:
    • iSCSI: Uses Ethernet, IP, and TCP as the underlying network layers.
    • FCoE: Uses its own EtherType and runs directly over lossless Ethernet (Data Center Ethernet); it does not use TCP/IP. FCoE preserves the FC specification/constructs while running on Ethernet. Since frame drops are not acceptable in FC traffic, FCoE depends on lossless Ethernet rather than TCP retransmission.
  • Network switches:
    • iSCSI: Uses traditional Ethernet switches.
    • FCoE: Uses switches that support the Data Center Bridging (DCB) protocol.
  • Adapters:
    • iSCSI: Uses a standard NIC.
    • FCoE: Uses a Converged Network Adapter (CNA) to encapsulate FC frames in Ethernet frames.
  • Frame format:
    • iSCSI: Ethernet Header > IP > TCP > Data > CRC.
    • FCoE: Ethernet Header > FCoE Header > FC Header > FC Payload > CRC > EOF.



2 - SAN v/s NAS


Parameters compared:
  • Definition:
    • SAN: A dedicated high-speed network that includes servers, VMs, switches, storage, and so on.
    • NAS: A NAS device is essentially a file server, made up of one or more HDDs, an OS, and an Ethernet connection to the network. Example: network drives in Microsoft Windows.
  • How data is stored and accessed:
    • SAN: Block-level data storage and access; a specific location or a range of storage blocks is specified. Hey, give me access to blocks 1111 to 2222; you get all the data stored between those two blocks.
    • NAS: File-level data storage and access. Hey, NAS server, give me file XYZ.
  • Protocols used:
    • SAN: FC, FCoE, iSCSI.
    • NAS: CIFS, NFS.
  • Connectivity:
    • SAN: Typically uses Fibre Channel connectivity.
    • NAS: Uses a standard Ethernet connection.
  • How the client OS sees it:
    • SAN: Appears as local storage.
    • NAS: Appears as remote storage.

3 - Hypervisors:

  • Type 1 hypervisors: The hypervisor (a thin layer of code) sits directly on top of the hardware (a bare-metal hypervisor). Once configured successfully, you can create VMs on the hypervisor. Examples are ESX, KVM, and Hyper-V. Hardware –> hypervisor –> VMs –> VM operating system –> applications
  • Type 2 hypervisors: The host OS sits on top of the hardware, and the hypervisor is installed on the OS. Once the hypervisor is ready, you can proceed with VM creation. Examples are VMware Workstation and Oracle VirtualBox. (Solaris Zones, often mentioned here, is OS-level virtualization rather than a true hypervisor.) Hardware –> Host OS –> hypervisor –> VMs –> VM operating system –> applications

4 - Why is Hyper-V a type 1 hypervisor when it runs as a role on Windows Server 2012 R2?


While installing Windows Server 2012 R2, you select the roles that you want to enable, for example, DHCP, DNS, and Hyper-V. When you select the Hyper-V role, the system automatically converts the host OS into a VM and slides the hypervisor in beneath it. This VM is referred to as the parent partition in Hyper-V. Once the parent partition is available, you can create child partitions (VMs).
  • The parent partition in Hyper-V consists of several components: the WMI Provider, the Virtual Machine Management Service, the kernel, and the Virtualization Service Provider (VSP).
  • A child partition consists of the applications, the kernel, the Virtualization Service Consumer (VSC), and so on.

5 - Processors, CPUs, Cores, and vCPUs


A motherboard can be single-socket or multi-socket, holding one or more physical processors (chips). Each physical processor contains multiple cores, and the operating system sees each core as an independent CPU.

One step further: in a virtualized environment, when you install the hypervisor, each physical CPU is further abstracted into virtual CPUs (vCPUs). The hypervisor divides the available CPU cycles of each core among the vCPUs, and as a result, multiple VMs can share the CPU time.

This is referred to as CPU scheduling/time sharing. Take a look at the following screenshot from a Windows 10 machine: one processor (socket), 2 cores, and 4 logical processors (analogous to vCPUs).
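
As a quick check, Java itself can report the logical processor count the OS exposes; a one-line sketch:

    public class CpuInfo {
        public static void main(String[] args) {
            // On the machine described above, this prints 4,
            // one per logical processor.
            System.out.println("Logical processors: "
                + Runtime.getRuntime().availableProcessors());
        }
    }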

The following diagram illustrates the relationship between a processor and its cores:



6 - More questions on networking

  • What is an interframe gap?
    The silence a transmitter must observe between frames to handle any clocking errors. On standard 10 Mbps Ethernet, the gap is 9.6 microseconds (96 bit times); that is, the transmitter waits 9.6 microseconds before it sends another frame. With faster networks, this value decreases: it is far lower on 100 Gb Ethernet than on standard 10 Mbps Ethernet.
  • What is MTU?
    The largest IP packet an Ethernet frame can carry; for standard Ethernet, it is 1500 bytes. Note that the system adds more bytes on top of it, for example, the source and destination MAC addresses and the CRC.
  • What is a jumbo frame?
    An Ethernet frame with a payload larger than the standard 1500-byte MTU. Most jumbo frames are 9000 bytes in size.
  • Why does using jumbo frames improve network performance?
    Because with roughly 9000 bytes of payload per frame, the same amount of data arrives in about one-sixth as many frames, so your network switch has far fewer frames to process. The fixed per-frame overhead also shrinks relative to the payload: roughly 1500/1518 ≈ 98.8% efficiency for a standard frame versus 9000/9018 ≈ 99.8% for a jumbo frame.
  • How does the Address Resolution Protocol (ARP) work?

    ARP takes care of IP-address-to-MAC-address mapping, and it involves both IP packets and frames. Each computer has two addresses: an IP address (32-bit) and a MAC address (48-bit).

    The IP address is a logical address, typically assigned dynamically by DHCP, and it takes care of addressing at the Network layer. The MAC address, on the other hand, is hard-coded into your computer's NIC, and it works at the Data Link layer.
    An IP packet carries the IP addresses of the source and destination machines. Since we are using Ethernet for data transmission, we need to encapsulate this IP packet in an Ethernet frame, and the Ethernet frame carries the MAC addresses of the source and destination machines.

    Let's assume that a computer (A) wants to send data to another computer (B). It needs to know (1) the IP address and (2) the MAC address of the destination. DNS helps it find the IP address; the remaining piece is the MAC address of computer B. To get it, computer A first sends an ARP request. It's a broadcast, and all the computers in the segment receive it.
    An ARP request is essentially a way of broadcasting a message.
    Something like: Hi, my IP address is 1.2.3.4 and my MAC address is XXXXX. I am looking for the MAC address of the destination machine whose IP address is 11.22.33.44. The machine with the IP address 11.22.33.44 then replies with its MAC address; this is known as the ARP reply. The source machine (A) receives the response and finally has the MAC address of the destination machine. The destination machine (B) also updates its ARP table with the MAC address of the source machine (A).

    After the successful completion of one ARP cycle (request and reply), both systems have updated their respective ARP tables with each other's IP and MAC addresses. Computer A now has everything (IP and MAC addresses) it needs to send the Ethernet frame to computer B. What does an ARP packet look like? It includes the hardware type, protocol type, hardware address length, protocol address length, operation, sender hardware (MAC) address, sender IP address, target hardware (MAC) address, and target IP address. An ARP packet is either an ARP request (the operation field's value is 1) or an ARP reply (the operation field's value is 2). A sketch of these fields follows.
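
Here is a minimal Java sketch that lays out those fields for an IPv4-over-Ethernet ARP request; the byte layout follows the standard ARP format, but the class name and example addresses are just for illustration:

    import java.nio.ByteBuffer;

    public class ArpRequestBuilder {
        // Builds the 28-byte body of an ARP request (IPv4 over Ethernet).
        static byte[] buildArpRequest(byte[] senderMac, byte[] senderIp,
                                      byte[] targetIp) {
            ByteBuffer buf = ByteBuffer.allocate(28);
            buf.putShort((short) 1);       // hardware type: 1 = Ethernet
            buf.putShort((short) 0x0800);  // protocol type: IPv4
            buf.put((byte) 6);             // hardware address length (MAC)
            buf.put((byte) 4);             // protocol address length (IPv4)
            buf.putShort((short) 1);       // operation: 1 = request, 2 = reply
            buf.put(senderMac);            // sender hardware (MAC) address
            buf.put(senderIp);             // sender IP address
            buf.put(new byte[6]);          // target MAC: unknown, all zeros
            buf.put(targetIp);             // target IP address
            return buf.array();
        }

        public static void main(String[] args) {
            byte[] arp = buildArpRequest(
                new byte[]{0x00, 0x1A, 0x2B, 0x3C, 0x4D, 0x5E}, // sender MAC
                new byte[]{1, 2, 3, 4},                          // 1.2.3.4
                new byte[]{11, 22, 33, 44});                     // 11.22.33.44
            System.out.println("ARP request body: " + arp.length + " bytes");
        }
    }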

Security, Certificates, Ciphers, and Data Encryption fundamentals

Why do we need the ExportPrivateKey.jar utility in SAP ASE?

We have seen a transition in the cryptographic libraries that SAP Adaptive Server Enterprise (SAP ASE) uses for encryption. 
  • First, SAP ASE started using OpenSSL libraries instead of the Certicom security libraries.
  • Then, because of security vulnerabilities (for example, Sweet32) in the OpenSSL library, SAP ASE moved from OpenSSL to SAP CCL (sapcrypto.dll), which is SAP's own proprietary cryptographic library.
Note that SAP ASE can handle certificates only in PKCS#8 format. Since OpenSSL is no longer used, and the Java keytool cannot convert a certificate to PKCS#8 format, SAP introduced the ExportPrivateKey.jar utility, which lets you convert a certificate to PKCS#8 format so that SAP ASE can process it.

Building blocks of a secured communication

To establish secure communication between a client and a server, various components and security protocols are used.

In this post, we'll learn about the following:
  • Protocols that are used (SSL and TLS, for instance)
  • Certificate (SSL certificate, for example)
  • Signatures
  • Ciphers
  • Cipher Suites (strong/weak/FIPS-compliant) and the order in which these cipher suites are negotiated between a server and its client.
  • Algorithm and functions (hash function, for example)
  • Libraries (Certicom/OpenSSL) and tools to generate certificates. For example, OpenSSL is the library, and openssl is the command-line tool used to generate SSL certificates.
  • Compliance status of the generated certificate (FIPS compliance, for example)

Computer Networking Fundamentals

The 7-layer OSI reference model (important from an academic standpoint) and the 4-layer TCP/IP model (used in day-to-day networking) help you understand networking features, devices, layer dependencies, and packet/frame forwarding decisions.
  • Different protocols are used at each layer:
    • Application layer: DHCP, DNS, FTP, HTTP, SSH, SSL/TLS, and POP
    • Transport layer: TCP (connection-oriented) and UDP (connection-less)
    • Network layer: IP (v4 and v6) and OSPF
    • Data Link layer: ARP and Ethernet
      ARP maps IP addresses to MAC addresses (IP-MAC translation), and each device maintains its own ARP cache. Different software ports are used for different protocols, for example, port 80 for HTTP and port 443 for HTTPS (which uses an SSL certificate).
      NOTE: SMTP, POP, and IMAP (used for e-mail), FTP (used for file transfer), and HTTP/HTTPS (used for the Web) run over TCP. DHCP (used for host configuration/dynamic IP allocation) and SNMP (used for network management) run over UDP, while DNS (used for name resolution) uses mainly UDP and falls back to TCP for large responses.
  • Different physical devices operate at different layers: for example, a switch at the Data Link layer, a hub at the Physical layer, and a router at the Network layer. A home router typically has two IP addresses, a public one and a private one.
  • Different packet-forwarding mechanisms are used. For example, a switch forwards based on MAC addresses (using its MAC address table), while a router takes its decisions based on IP addresses (using its routing table). A hub simply repeats, or broadcasts, every frame; it has no intelligence.
    A switch with an empty MAC address table behaves just like a hub in its very first cycle. In subsequent cycles, it learns the MAC address/port mappings, updates its MAC address table, and thereafter forwards frames intelligently to the right ports.
  • Different terminology is used for data at different layers: for example, bits (at the Physical layer), frames (at the Data Link layer), and packets (at the Network layer).
When data is sent from source to destination, it travels through all four layers of the TCP/IP stack. At the source, as the data passes through each layer, an additional piece of information is appended to it. For example, a segment at the Transport layer becomes a packet when the source and destination IP addresses are added to it at the Network layer.
Similarly, a packet becomes a frame at the Data Link layer. A frame carries the physical (MAC) addresses of the two nodes the data is travelling between.
Here's a quick reference:
  • At the Transport layer: Data + TCP header = Segment
  • At the Network layer: Segment + IP header = Packet
  • At the Data Link layer: Packet + Ethernet header = Frame
The sequence of this assembly (encapsulation) is very important; the destination strips the data of this information in the reverse order (disassembly, or de-encapsulation). The toy sketch below illustrates the idea.
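
The following toy Java sketch mimics that layering; the "headers" here are just ASCII tags, not real protocol headers:

    import java.nio.charset.StandardCharsets;

    public class EncapsulationDemo {
        // Prepends a fake header to a payload, as each layer does.
        static byte[] wrap(String header, byte[] payload) {
            byte[] h = header.getBytes(StandardCharsets.US_ASCII);
            byte[] out = new byte[h.length + payload.length];
            System.arraycopy(h, 0, out, 0, h.length);
            System.arraycopy(payload, 0, out, h.length, payload.length);
            return out;
        }

        public static void main(String[] args) {
            byte[] data    = "hello".getBytes(StandardCharsets.US_ASCII);
            byte[] segment = wrap("[TCP hdr]", data);    // Transport layer
            byte[] packet  = wrap("[IP hdr]",  segment); // Network layer
            byte[] frame   = wrap("[Eth hdr]", packet);  // Data Link layer
            // Prints: [Eth hdr][IP hdr][TCP hdr]hello
            System.out.println(new String(frame, StandardCharsets.US_ASCII));
        }
    }

The destination would strip the headers in the reverse order: Ethernet first, then IP, then TCP.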


Data format at each layer:

  • At the Transport layer: [UDP/TCP Header + UDP/TCP Data]
    At the Transport layer, both TCP (connection-oriented) and UDP (connection-less) are used. The data unit is a segment for TCP and a datagram for UDP.
  • At the Network layer: [IP Header + IP Data]: Collectively known as a Packet.
  • At the Data Link layer: [Frame Header + Frame Data + Frame Footer]: Collectively known as a Frame/Ethernet Frame.

Details on each type of header: 

  • For Segment
    • TCP Segment: Source Port, Destination Port, Acknowledgement Number, Control Bits, Checksum etc.
    • UDP Datagram: Source Port, Destination Port, Checksum etc.
  • For Packet
    • IPv4 Packet Fields: Version (identifies the IP version - IPv4 or IPv6), Internet Header Length, Time-to-Live, Header Checksum, Source IP Address, Destination IP Address etc.
    • IPv6 Packet Fields: Version, Flow Label, Payload Length, Source Address (128-bit), Destination Address (128-bit).
  • For Frame
    • Generic Frame Fields: Frame Start, Addressing, Type, Control, Data, Error Detection, and Frame Stop
    • Ethernet II Frame Fields: Preamble, Destination Address, Source Address, Data, etc.
    • IEEE 802.3 Frame Fields: Preamble, Destination Address, Source Address, 802.2 Header and Data, etc.
    • IEEE 802.3ac Frame Fields: Preamble, Destination Address, Source Address, 802.1Q VLAN Tag, 802.2 Header and Data etc.
    • PPP Frame Fields: Flag, Address, Control, Protocol, Data etc.
    • Wireless Frame Fields: Frame Control, Destination Address, Source Address, Receiver Address, Transmitter Address, Frame Body etc.


Understanding Enterprise Architecture (EA)

The concept of Enterprise Architecture (EA) integrates technology and business logic and presents a comprehensive view of the entire organization and its business process flows.

It enables business managers as well as technology managers to view the entire business process in a more integrated manner. It provides contextual, conceptual, logical, physical, and functioning models of your business processes, and it addresses the what, how, where, who, when, and why of the business.

This process provides a complete blueprint of the business. For example, to construct a complex building or structure, the first step is a blueprint and model of the building, which is produced by the architect.
And then there are various other views: the planner's view, the owner's view, the designer's view, and the builder's view.

Similarly, EA creates a working business model with different views. There have been many attempts to define EA, but the most popular approach is the Zachman Framework, proposed by John Zachman, a former IBM engineer.
The Zachman Framework divides the entire organization based on the following six parameters:
  • Data (What)
  • Functions (How)
  • Network (Where)
  • People (Who)
  • Time (When)
  • Motivation (Why)
After this division, it generates different views for business managers and for IT managers. It also uses a separate repository or database to store the information and documents of all these business processes; the business managers and the IT managers can extract the information they need from this repository.

It also enables them to align the various business processes with the IT infrastructure, for example, an ORDER process (which uses a CRM system), an ASSEMBLY AND DELIVERY process (which uses an SCM system), or a BILLING process (which uses an ERP system).

Enterprise Architecture provides more direct and efficient control over all business processes and information technology resources. It also provides better insight for strategic planning, keeping organizations ahead in the globally competitive environment.

Understanding Product Lifecycle Management (PLM)

Product Lifecycle Management, or PLM (earlier known as PDM), is a term used to describe the processes and business strategies for managing the entire life cycle of a product or service (from concept and design, through production analysis, research and development, production, marketing, maturity, and saturation, to eventual withdrawal from the market).

In other words, a PLM system captures, stores, and manages the data for the entire product development spectrum (from cradle to grave).

It is important to note that PLM is not just a piece of technology (or a software application); it is an extended enterprise approach that facilitates the sharing of product information (beyond the design house) across all departments of an organization (marketing, after-sales, support, and so on) by linking people, processes, and the business with the product information.

The prime objective of a PLM system is to effectively manage the Corporate Intellectual Capital (CIC) of an organization. The CIC includes:
  • Product Definition: All information relating to what the product is, its specifications, how it is designed, manufacturing details, support and so on.
  • Product History: Any information relating to what the organization has done in the past in relation to the product.
  • Best Practices: The experience gathered by the organization in the process of developing the product. Best practices are a critical factor when deciding the future business strategy for the product.
A typical PLM system consists of software (web servers, application servers, database servers, and front-end applications), middleware, and hardware. A PLM application can also be linked with the other enterprise applications running in an organization (ERP, CRM, SCM, ECM, and Business Intelligence applications, to name a few).

The PLM application extracts product-related information from all these applications and puts it in a single repository. From this repository, various users can access the information and use it for better decision-making and for formulating future strategies.

For historical reasons, PLM has been used primarily for automobile, aerospace, and machine design (to get the maximum benefit from digital manufacturing), but PLM is now being implemented in industries beyond engineering, for example, medical, banking, insurance, genetic research, and pharmaceuticals.

In the digitized economy, PLM helps organizations operate globally and produce high-quality products with less manufacturing time, enabling them to stay ahead of their competitors.

Product Lifecycle Management is the answer to today's design-driven, customer-centric business needs.

N_Port ID virtualization (NPIV)

Before starting with NPIV, it's important to understand the different types of ports used in a typical FC fabric.

  • N_Port: The end node of your FC fabric. For example, the port on an HBA.
  • F_Port: A port on the FC switch that connects the switch to an HBA; essentially, the link runs between the N_Port on the HBA and the F_Port on the Fibre Channel switch.
  • E_Port: It connects one Fibre Channel switch in the SAN fabric to another Fibre Channel switch.
During the FLOGI process, the Fabric Login Server gives a 24-bit address to the N_Port of the HBA. This 24-bit address is the N_Port_ID. So, in this scenario, the N_Port has only one N_Port_ID associated with it.

Note that there is another unique identifier for the port, the World Wide Port Name (WWPN), and there is a one-to-one mapping between the WWPN and the N_Port_ID.
In summary, we can say that for this particular physical N_Port, we have ONLY ONE WWPN and ONLY ONE N_Port_ID.

Now, if this physical N_Port is NPIV-enabled, it has the capability to have multiple WWPNs, with multiple corresponding N_Port_IDs associated with them.

After the completion of the FLOGI process, the NPIV-enabled physical N_Port issues additional commands to register more WWPNs and receive more N_Port_IDs. So, now we have multiple WWPNs for a single physical N_Port.

With NPIV, the physical N_Port can register additional WWPNs (and N_Port_IDs); as a result, each VM can be given its own WWPN and can see only the LUNs it needs. Not all VMs can see all LUNs.

In a nutshell: the NPIV feature gives a single physical N_Port the ability to register multiple WWPNs and N_Port_IDs.

Take a look at the following diagram to understand N, F, and E type ports:


File Retention Policies in EMC Celerra

One of EMC’s NAS devices, Celerra, uses a WORM (Write Once Read Many) mechanism to protect files and directories from deletion or alteration. You can specify a file retention period; during this period, no user except the storage admin can edit or delete the files.
In the process of marking a file as WORM, the file goes through three states: CLEAN, WORM, and EXPIRED. This WORM capability of Celerra works over two NAS protocols, NFS and CIFS. You can mark a file as WORM using one of the following options:

  • The Data Access in Real Time (DART) operating system lets you configure the retention period.
  • Alternatively, you can use Celerra’s file-level retention property to designate a file as WORM (known as CWORM). This is a part of the Celerra File Server configuration.

Better Storage Utilization with Dynamic Provisioning

De-duplication and thin provisioning are among the preferred techniques that can help you improve your storage utilization substantially.

In de-duplication, removing redundant data helps you reclaim storage space.

In thin provisioning, the storage administrator allocates merely “logical storage” to an application. Technically, the file system consumes physical capacity only when a “write” operation is performed.

Thin reclamation for expensive tier 1 and 2 storage


Upon deletion of files, the resultant empty storage is automatically (and periodically) reclaimed by the File System, and is added to the common storage pool.

This automatic reclamation also enables you to maximize the utilization of expensive tier 0, 1, and 2 storage media.

Along with reclamation, “Automated Tiering” of data allows you to move non-critical data from expensive tier 0/1 media to economical tier 2/3 storage.

Examples include “Easy Tier” for IBM System Storage® DS8700 and the “3PAR Adaptive Optimization” software.

Thin provisioning is not a one-size-fits-all solution


It’s about applying the technology in the right way. For thin provisioning, understanding your “data growth projection” and constantly monitoring the underlying storage resources are critical parameters of your strategy.

You need to fine-tune data usage threshold levels, so that resource exhaustion alerting is accurate and timely.

The ability to forecast an application’s data usage trajectory also enables you to free storage from the storage pool for other applications.

Thin provisioning is, after all, about utilizing more than you have at your disposal (much like credit cards). What if all your applications need storage at once?

Resource exhaustion is not uncommon, and it is up to the administrator’s discretion to keep an optimum balance of efficient storage utilization and application availability.

Virtualization has put a lot of pressure on storage: you can instantly create VMs with configurable storage.

The gold images of VMs are created with much more storage than needed (and they remain underutilized). Thin provisioning, in such a scenario, is of course a great tool for increasing storage utilization.

Use proper storage allotment policies

You need tools that can compare logically allocated storage with the actual available storage space and validate current storage allocation.

Solid-State Devices: Answer to the Storage Needs of The Most Demanding Businesses

Like storage virtualization and thin/virtual provisioning, Solid-State Devices (SSDs), too, are ready to change the SAN landscape and transform storage economics dramatically.

The absence of moving mechanical parts (no magnetic platters and drive heads) makes SSDs more reliable and better performers, with lower access time and latency.

Unlike HDDs, which use electro-mechanical movement and magnetic data storage, an SSD is essentially a collection of semiconductor memory. With tremendous speed (as high as 500 MB/s in the SanDisk X100), SSDs outpace traditional HDDs.


SSDs for tier 0 storage?


Some of the prominent characteristics of SSDs are faster boot-up and file search, quicker virus scans, less heat generation, and minimal downtime. Minimal application downtime is the key requirement for High Availability (HA) and a significant prerequisite for meeting stringent SLAs.

For I/O-intensive applications (transactional data, mission-critical applications, CAD, Business Intelligence, and databases), SSDs are ideal candidates, since superior performance and unrivaled reliability are their key differentiators.

They can be used for tier 0 storage in Storage Area Network (SAN) as well as Network Attached Storage (NAS) environments.

Vendors like Violin Memory also offer flash memory arrays (for example, the 6000 Flash Memory Array) for enterprise-grade applications, with fully redundant components, a hot-swap option, and no single point of failure.


SSD market


The SSD industry is growing at a phenomenal pace, and it is on its way to becoming a prominent IT/storage trend. Many leading vendors (Fusion-io, Violin Memory, and STEC, to name a few) provide a range of software for SSDs, from Big Data and Cloud Computing to support for various virtualization technologies (VMware, IBM LPAR/VIOS, and so on).

SSDs support various product interfaces, for instance, Fibre Channel (FC), PCI, Serial Attached SCSI (SAS), and Universal Serial Bus (USB).

Interoperability?


A typical datacenter is a heterogeneous environment that includes servers and storage from multiple vendors. Virtualization of the network, servers, storage, and applications adds a further layer of complexity.

Eventually, it becomes a daunting task to manage, configure, and monitor these entities. Interoperability and centralized management are probably the answer to this problem.
In the HDD array segment, several initiatives have been taken to address this issue. For example, Hitachi Data Systems (HDS) and the Storage Networking Industry Association (SNIA) launched the Storage Management Initiative (SMI), advocating the importance of open storage network management technology. As an example, IBM XIV arrays can be managed by the Hitachi HiCommand server.
Similarly, interoperability is being discussed for SSDs as well. The Solid State Storage Initiative (a part of SNIA) is working to address the SSD interoperability challenges.

EMC VPLEX and Storage Array Virtualization

The EMC VPLEX array supports storage virtualization. This means that one or more back-end arrays (for example, Symmetrix) can provide storage to the VPLEX array. The storage used in a VPLEX cluster can be physical or virtual.
In this post, we'll discuss virtualized storage for VPLEX.

Storage Volume (SV): Essentially, it's the storage provided by the back-end array. On top of SVs, you create Extents. There can be multiple Extents on one SV, or a one-to-one mapping between an Extent and a Storage Volume.

Using these Extents, you can then create Devices. Again, you can form a Device from multiple Extents or create a one-to-one mapping between an Extent and a Device.

And lastly, Virtual Volumes (VVs) are created from these Devices. A VV can be created from a normal Device or a Distributed Device. If you used a normal Device to create the VV, it is referred to simply as a Virtual Volume; if you used a Distributed Device, it is referred to as a Distributed VV. The hierarchy is sketched below.
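
To summarize the layering, here is a toy Java sketch of the hierarchy; the class names mirror the VPLEX terms but are purely illustrative, not the VPLEX API:

    import java.util.List;

    class StorageVolume { String name; }       // storage from the back-end array
    class Extent { StorageVolume source; }     // one or more Extents carved per SV
    class Device { List<Extent> extents; }     // built from one or more Extents
    class VirtualVolume {                      // one Device per Virtual Volume
        Device device;
        boolean distributed;                   // true for a Distributed VV
    }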