The annual Flash Memory Summit was virtual this year, like many events. It was also delayed from August to early November. The 2020 event featured speakers addressing the development and use of solid state memory technologies and interfaces, and even the announcement of a new alliance focused on the development and promotion of DNA for digital storage. Let’s look at some of these developments.
Amber Huffman, Intel Fellow and President of NVM Express, spoke about how NVMe is fixing the memory and storage hierarchy, creating architectural advances and enabling computational storage. In the course of her talk she mentioned Intel’s 2nd generation of Optane memory with 4 layers of cells (up from 2 layers in the original memory), announced in the summer of 2020. The figure below shows new memory solutions using NVMe that are filling in the storage and memory hierarchy.
She also said that NVMe’s base specification with fabrics has helped to reduce the complexity of implementing new storage solutions, including zoned namespaces and computational storage, and that this will create an extensible infrastructure that will take us through the next decade of growth. Below is an updated roadmap for NVMe showing the introduction of the simplifying base specification in the second half of 2020.
Namespace Types will enable alternate command sets that support block I/O as well as key value and zoned storage within an NVM subsystem. Zoned namespaces in particular can reduce write amplification and overprovisioning in an SSD. Endurance groups are also supported, allowing flexible capacity management based on access patterns and media types. The new specifications also enable domains and partitions that can support larger scale storage systems.
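To make the zoned idea concrete, here is a minimal Python sketch of one zone, assuming the usual zoned-storage rule that writes land only at a write pointer and space is reclaimed by resetting a whole zone. The `Zone` class and the `write_amplification` helper are hypothetical names for illustration, not part of the NVMe specification or any driver API.

```python
class Zone:
    """Toy model of a single zone: append-only until the zone is reset."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.write_pointer = 0  # next block offset that may be written
        self.blocks = []

    def append(self, data):
        """Write one block at the write pointer and return its offset."""
        if self.write_pointer >= self.capacity:
            raise RuntimeError("zone full: reset it before writing again")
        self.blocks.append(data)
        offset = self.write_pointer
        self.write_pointer += 1
        return offset

    def reset(self):
        """Reclaim the whole zone at once, with no per-block relocation."""
        self.blocks.clear()
        self.write_pointer = 0


def write_amplification(nand_bytes_written, host_bytes_written):
    """Write amplification factor: media writes divided by host writes."""
    return nand_bytes_written / host_bytes_written


if __name__ == "__main__":
    zone = Zone(capacity_blocks=4)
    print([zone.append(b"x") for _ in range(4)])  # offsets are sequential
    zone.reset()
    print(zone.write_pointer)
    # Hypothetical byte counts, purely to show how the ratio is formed.
    print(write_amplification(nand_bytes_written=150, host_bytes_written=100))
```

Because the host writes sequentially and frees space a zone at a time, the drive has less valid data to relocate during garbage collection, which is where the reduction in write amplification and overprovisioning comes from.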
Intel’s keynote by Alper Ilkbahar focused on the company’s Optane persistent memory. He pointed out that DRAM scaling has slowed somewhat, with density gains going from 133% per year to 100% per year, and he projected that this annual DRAM density increase would fall to 50% going forward. This is causing a gap between compute memory requirements and DRAM scaling that Optane memory can help fill. Intel sees Optane memory filling roles in both memory and storage as shown below.
The Intel Optane PMem 200 series of DIMM memory uses the company’s second generation 4-layer memory and is optimized for Intel’s 3rd generation Intel Xeon Scalable processors. These offer up to 4.5TB of memory per socket and an average of 25% higher memory bandwidth compared to the prior generation product. Moving more data access to byte level, rather than block level, improves storage performance. Distributed asynchronous object storage (DAOS) with a DAOS storage engine allows storing metadata, low latency I/Os and indexing in the Optane DIMMs while bulk data is kept in SSDs or HDDs.
This approach is seen as an advance over POSIX storage. It scales for structured and unstructured data and can accelerate data analytics and AI computations. Testing with Oracle showed that an Exadata X8M with a persistent memory data accelerator based upon Optane PM improved IOPS by 2.5X and reduced latency by 10X. Also, remote direct memory access (RDMA) to Optane PM improved replication latency compared to replication to SSDs.
Marvell’s Thad Omura gave a keynote talk and spoke about how to improve utilization of NVMe in servers and data centers, and about DRAM-less low power design in small form factor solutions. In particular, he spoke about a new native Ethernet SSD with Kioxia as shown in the image below. This product has dual 25Gb/s Ethernet ports and allows a direct interface between the SSD and an Ethernet-based NVMe-oF network.
This enables interesting architectures such as an Ethernet Bunch of Flash (EBOF), which the company said it is working on with Micron. It showed performance (4X greater throughput) and latency (72% lower latency) improvements from hooking an EBOF to an NVIDIA DGX with GPUDirect Storage, with host CPU bypass.
IBM gave a keynote talk advocating the use of QLC NAND flash, even in primary storage. The company said that it has enabled QLC with TLC-like endurance and good performance. A key element in enabling the use of QLC flash is its FlashCore Modules (FCMs), built as dual-port NVMe U.2 SSDs. QLC endurance is accomplished with data compression, enhanced garbage collection with low write amplification, dynamic SLC allocation, smart data placement and other IBM IP.
The company also spoke about computational storage using cores and FPGAs on the SSDs in its FCMs, based upon SNIA committee developments. It also talked about incorporating persistent memory, potentially to displace battery powered DRAMs and to provide PM caches and metadata stores, and about using CXL to create composable storage.
NEO Semiconductor also gave a keynote talk about combining QLC density with SLC speed. QLC cells are fundamentally slower than TLC cells. An SLC cache provides performance improvements, except in write intensive environments: when the SLC cache is full, the write speed decreases considerably. Neo is increasing write throughput by increasing the number of planes in the bitline direction to increase write parallelism.
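The SLC cache cliff can be illustrated with a small Python sketch. It assumes a deliberately simple model in which a write burst runs at SLC speed until the cache is full and at native QLC speed afterward, with no background draining of the cache. The function names and every number below are hypothetical, not figures from the talk.

```python
def burst_write_seconds(burst_gb, cache_gb, slc_gb_per_s, qlc_gb_per_s):
    """Time to absorb a write burst: fast until the SLC cache fills, slow after."""
    fast_gb = min(burst_gb, cache_gb)   # portion absorbed by the SLC cache
    slow_gb = burst_gb - fast_gb        # remainder written at native QLC speed
    return fast_gb / slc_gb_per_s + slow_gb / qlc_gb_per_s


def effective_throughput(burst_gb, cache_gb, slc_gb_per_s, qlc_gb_per_s):
    """Average write speed over the whole burst, in GB/s."""
    seconds = burst_write_seconds(burst_gb, cache_gb, slc_gb_per_s, qlc_gb_per_s)
    return burst_gb / seconds


if __name__ == "__main__":
    # Hypothetical drive: 50 GB SLC cache, 2.0 GB/s cached, 0.25 GB/s uncached.
    for burst in (20, 200):
        print(burst, effective_throughput(burst, 50, 2.0, 0.25))
```

In this model a burst that fits in the cache runs at full SLC speed, while a burst four times the cache size averages a small fraction of it, which is the behavior in write intensive workloads that the talk set out to address.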
This allows more pages to be programmed at the same time and reduces the bit line length, which reduces the read time. In order to limit the resulting increase in page buffers, the X-NAND architecture uses one page buffer to read and write 16 or more bit lines in parallel. As a result, the number of page buffers in each plane is reduced to 1/16 of the original. This allows increasing the number of planes by 16 times without increasing the die size and thus increases write/read throughput by 16X as shown below.
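The arithmetic behind the 16X claim can be worked through in a few lines of Python. This is a sketch under stated assumptions: the total page buffer count is held constant as a stand-in for constant die area, and throughput is taken to scale linearly with the number of planes. The baseline figures (4 planes, 16,384 page buffers per plane) and the function name are hypothetical.

```python
def plane_scaling(baseline_planes, buffers_per_plane, bit_lines_per_buffer):
    """Planes that fit in the same page buffer budget when buffers are shared."""
    if buffers_per_plane % bit_lines_per_buffer:
        raise ValueError("buffers_per_plane must divide evenly")
    total_buffers = baseline_planes * buffers_per_plane          # fixed budget
    shared_per_plane = buffers_per_plane // bit_lines_per_buffer  # 1/16 each
    planes = total_buffers // shared_per_plane
    return {
        "buffers_per_plane": shared_per_plane,
        "planes": planes,
        "throughput_gain": planes / baseline_planes,
    }


if __name__ == "__main__":
    # Hypothetical baseline die; one buffer serves 16 bit lines as in the talk.
    print(plane_scaling(baseline_planes=4, buffers_per_plane=16384,
                        bit_lines_per_buffer=16))
```

With one buffer shared across 16 bit lines, each plane needs one sixteenth of the buffers, so sixteen times as many planes fit in the same buffer budget, and the gain is independent of the baseline chosen.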
As a result, Neo says that random read and write speeds are improved by 3X and sequential read and write speeds are improved by 27X and 14X respectively. The company says that these performance gains are achieved at QLC die costs and that the same sort of performance increases are possible with PLC flash memory as well. It also says that this multiple bit line approach can be used with CMOS-under-array NAND flash and, when applied to NAND in 3D chip integration, enables NAND performance that is not a bottleneck.
Numem, an MRAM startup, gave a keynote talk at the 2020 FMS. They spoke about their work to provide persistent memory for deep neural network accelerators for NASA’s space programs. They pointed out that logistical issues make it hard to do computation in remote data centers for space vehicles that are far away from earth, thus localized processing is needed. Memory for processing in space must be radiation hard and dense enough for AI applications, and it must have good data retention in the case of power loss and environmental instabilities, high endurance and the capability to work in a wide temperature range. Numem’s chips use TSMC’s 22nm MRAM process.
Numem’s memory architecture includes high-speed SRAM for video streaming, low power and small area dense memory using NuRAM (MRAM), and flow control between the high speed SRAM and NuRAM. This architecture is shown in the figure below. Using MRAM rather than SRAM shrinks the coefficient memory storage for a DNN by 2.5-3.5 times or provides much more memory in the same area as an SRAM solution. MRAM also reduces power requirements (more than 20X lower standby power) and reduces latency.
MRAM is one of several emerging persistent memories. Another is 3D XPoint, a type of phase change memory that is available as SSDs or DIMM memory modules from Intel and Micron. The 2020 Emerging Memories Find their Direction report projects that emerging memories will be a market of over $36B by 2030, with significant growth in memory capacity shipments as shown in the chart below.
An interesting development associated with the FMS was a press release and keynote talk by Microsoft on the formation of an alliance to advance data storage using DNA. Twist, Illumina and Western Digital announced the formation of an alliance with Microsoft to advance the field of DNA data storage. They plan to create a comprehensive industry roadmap to help the industry achieve interoperability between solutions and help establish the foundations for a cost-effective commercial archival storage ecosystem using DNA. DNA offers the possibility of a very dense digital storage medium that is stable over long periods of time.
Writing and reading data from DNA has been demonstrated. Microsoft’s Karin Strauss said that in collaboration with the University of Washington, they have demonstrated a fully automated end-to-end system capable of storing and retrieving data from DNA and have stored 1GB of data in DNA synthesized by Twist and recovered data from it. In addition to the announcing members, a number of other companies involved in DNA storage development have joined the alliance.
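For readers wondering how bits become bases, here is a toy Python sketch that maps every two bits to one of the four nucleotides. It is a textbook-style illustration only, not the encoding used by Microsoft, the University of Washington or Twist. Practical schemes add addressing and error correction and avoid problematic sequences such as long runs of one base. The bit-to-base mapping and function names are assumptions.

```python
BASES = "ACGT"  # assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T


def encode(data: bytes) -> str:
    """Turn each byte into four bases, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def decode(strand: str) -> bytes:
    """Inverse of encode: every four bases become one byte."""
    if len(strand) % 4:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    strand = encode(b"FMS 2020")
    print(strand)
    print(decode(strand))
```

At two bits per base, one byte takes four nucleotides, which gives a sense of why DNA is attractive as a dense archival medium.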
The 2020 Flash Memory Summit included interesting announcements on NVMe and NVMe-oF as well as Ethernet-based SSDs, increasing use of QLC NAND flash, new embedded MRAM announcements and even an announcement of a DNA storage consortium.