By Anand Thiruvengadam and Guy Cortez
Memories are everywhere in modern electronics. Discrete memory chips take up a lot of space on printed circuit boards (PCBs). Embedded memories consume much of the floor plan in system-on-chip (SoC) devices. Many multi-die configurations, including 2.5D/3DIC devices, are driven by the need for faster memory access. Memory design and verification is an important part of many projects.
Safety-critical applications such as autonomous vehicles, space systems, implanted medical devices, and nuclear power plants are no exception. The integrated circuits powering these applications contain a lot of memory, and the memory technology used must meet the same high standards of reliability and functional safety as the rest of the electronics. Memory development is neither a fully digital design flow nor a fully hand-crafted analog one. It has its own challenges and its own solutions.
The context of memory reliability is also important. Today’s electronic systems demand high memory bandwidth, fast throughput, and low latency. Additionally, generic memory devices are giving way to application-specific chips with stringent power, performance, and area (PPA) requirements. Memories now sit in a hyper-convergent design space, where several technologies, protocols, and architectures come together in a single, very complex design.
A recent blog post discussed the increasing “digitalization” of memory development, in which digital electronic design automation (EDA) tools are used to design the components at the periphery of the core memory array. Many functional safety techniques developed for digital logic can also be adopted for memories. These techniques meet the requirements of safety standards such as ISO 26262 for road vehicles.
Although the terms “safety” and “reliability” are sometimes used interchangeably, the overlap is only partial. Functional safety requires integrating safety mechanisms that detect faults in electronic devices and react appropriately, and demonstrating that this detection and response achieves a high degree of fault coverage. Reliability requires that the risk of a fault appearing be reduced as much as possible in the design and manufacture of the silicon.
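As a rough illustration of what "fault coverage" means here, the sketch below computes the fraction of dangerous injected faults that a safety mechanism detected. The `FaultResult` record, the fault names, and the simple ratio are all hypothetical and chosen for illustration; the metrics defined in ISO 26262 weight faults by failure rate and are more involved than this.

```python
from dataclasses import dataclass


@dataclass
class FaultResult:
    """Outcome of one injected fault (hypothetical record layout)."""
    name: str
    dangerous: bool  # would the fault violate a safety goal if undetected?
    detected: bool   # did a safety mechanism flag it?


def diagnostic_coverage(results):
    """Fraction of dangerous faults that were detected (unweighted)."""
    dangerous = [r for r in results if r.dangerous]
    if not dangerous:
        return 1.0  # nothing dangerous was injected, so nothing was missed
    detected = sum(1 for r in dangerous if r.detected)
    return detected / len(dangerous)


# Made-up campaign results, for illustration only.
campaign = [
    FaultResult("bitcell_stuck_at_0", dangerous=True, detected=True),
    FaultResult("wordline_open", dangerous=True, detected=True),
    FaultResult("sense_amp_offset", dangerous=True, detected=False),
    FaultResult("unused_pin_short", dangerous=False, detected=False),
]

coverage = diagnostic_coverage(campaign)
print(f"Diagnostic coverage: {coverage:.1%}")
```

Faults that cannot violate a safety goal are excluded from the denominator, which is why the harmless short in the example does not lower the result.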
Safety and reliability must cover the entire lifecycle of silicon, from design and verification through laboratory use to production use in the field. In the case of memory designs, the early and late stages of the lifecycle present the greatest reliability challenges. Early chip failures (sometimes called infant mortality) weed out marginal devices, followed by a period (perhaps years) of low-risk operation. As the effects of silicon aging begin to kick in, reliability decreases and failures become more frequent.
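This three-stage pattern is often drawn as a "bathtub curve." A minimal sketch, assuming the failure rate can be modeled as the sum of a decreasing Weibull hazard (early failures), a constant hazard (random failures), and an increasing Weibull hazard (wear-out), is shown below. All shape, scale, and rate values are hypothetical and are not taken from any real device data.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (k / lam) * (t / lam) ** (k - 1), for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t_years):
    """Total failure rate (per year) at age t_years; parameters are illustrative."""
    infant = weibull_hazard(t_years, shape=0.5, scale=50.0)   # shape < 1: decreasing
    random_failures = 0.01                                    # constant background rate
    wearout = weibull_hazard(t_years, shape=5.0, scale=15.0)  # shape > 1: increasing
    return infant + random_failures + wearout


for age in (0.01, 0.5, 5.0, 12.0, 20.0):
    print(f"age {age:5.2f} y -> failure rate {bathtub_hazard(age):.4f} per year")
```

With these assumed parameters the rate is high at the very start of life, falls to a low plateau for several years, and then climbs again as wear-out dominates.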
The memory development process should include robust static and dynamic analyses, performed prior to tapeout, to identify and mitigate potential failures throughout the silicon lifecycle:
- Early life
  - Static checks of analog and digital circuits
  - Analog fault simulation
- Normal life
  - High-sigma Monte Carlo analysis
  - Static power/signal net resistance checks
- End of life
  - Dynamic electromigration/IR drop (EMIR) analysis
  - Silicon aging analysis
The Synopsys PrimeWave reliability environment provides a unified workflow around all Synopsys PrimeSim reliability analysis technologies and Synopsys PrimeSim Continuum engines to improve productivity and ease of use. The process begins with Synopsys PrimeSim CCK, which extends traditional electrical rule checking (ERC) to the analog domain.
PrimeSim Custom Fault complements digital fault simulation to make functional safety and test coverage analysis practical for complete chips. It also meets the stringent requirements of ISO 26262 and other safety standards for comprehensive failure modes, effects, and diagnostic analysis (FMEDA).
Synopsys PrimeSim AVA provides high-sigma (typically 4 to 7 sigma) Monte Carlo analysis. It uses machine learning (ML) techniques to operate more efficiently while delivering accuracy within 1% of the Synopsys PrimeSim HSPICE circuit simulator. ML reduces the number of runs by orders of magnitude compared to the traditional brute-force Monte Carlo simulation approach.
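To see why brute-force Monte Carlo becomes impractical at high sigma, consider how rare the failures are. The sketch below assumes a single normally distributed parameter and a one-sided failure criterion, and estimates how many random samples would be needed to observe about ten failures. It is a back-of-the-envelope calculation only and does not represent how PrimeSim AVA works internally; the function names and the target of ten failures are illustrative choices.

```python
import math


def tail_probability(sigma):
    """One-sided probability that a standard normal variable exceeds `sigma`."""
    return 0.5 * math.erfc(sigma / math.sqrt(2.0))


def brute_force_samples(sigma, target_failures=10):
    """Samples needed so that the expected failure count is `target_failures`."""
    return math.ceil(target_failures / tail_probability(sigma))


for sigma in (4, 5, 6, 7):
    p_fail = tail_probability(sigma)
    n_runs = brute_force_samples(sigma)
    print(f"{sigma} sigma: P(fail) ~ {p_fail:.2e}, ~{n_runs:.1e} runs for 10 failures")
```

At 6 sigma the failure probability is on the order of one in a billion, so a brute-force run would need billions of circuit simulations, which is the gap that smarter sampling methods aim to close.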
Power/ground integrity analysis is provided by Synopsys PrimeSim SPRES, which is fast enough to run early in the memory development process. Likewise, Synopsys PrimeSim EMIR offers both high performance and foundry-certified signoff accuracy. This analysis covers the power distribution network (PDN) as well as the signal nets in the memory design. If problems are discovered, simulation analysis and debugging guidance make it easier to find and fix the source of potential faults before they surface as the silicon ages.
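The underlying idea of IR drop can be shown with a deliberately simple, static model: a single power rail fed from one end, split into resistive segments, with a load drawing current at each node. The function name and all resistance, current, and supply values below are hypothetical; a real EMIR analysis runs dynamically on the full extracted network rather than on a one-dimensional chain like this.

```python
def rail_voltages(vdd, segment_resistances, tap_currents):
    """Voltage at each node of a rail fed from one end.

    Segment k connects node k-1 (or the supply, for k = 0) to node k and
    carries the current of every load at node k and beyond.
    """
    if len(segment_resistances) != len(tap_currents):
        raise ValueError("need one tap current per segment")
    voltages = []
    v = vdd
    for k, resistance in enumerate(segment_resistances):
        downstream_current = sum(tap_currents[k:])
        v -= resistance * downstream_current  # Ohm's law drop across segment k
        voltages.append(v)
    return voltages


# Hypothetical rail: 0.8 V supply, four 50 mOhm segments, 100 mA per load.
voltages = rail_voltages(0.8, [0.05] * 4, [0.1] * 4)
worst_drop = 0.8 - min(voltages)
print([round(v, 4) for v in voltages], f"worst-case drop = {worst_drop * 1000:.1f} mV")
```

The segment nearest the supply carries the most current and contributes the largest share of the drop, which is also where electromigration stress tends to be highest in this simple picture.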
Synopsys PrimeSim MOSRA verifies reliability risks due to the effects of silicon aging. It also offers high performance with foundry-certified accuracy. When it is combined with the other PrimeSim reliability analysis technologies within the Synopsys PrimeWave reliability environment, memory designers can be confident that their chips will be functionally safe and reliable throughout a long and productive silicon lifetime.
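For intuition about aging, a commonly used empirical approximation treats the threshold-voltage shift of a stressed transistor as a power law in time, ΔVth = A · t^n. The sketch below uses that form to estimate when the shift would consume a given design margin. The coefficient, exponent, and margin are hypothetical, and this generic formula is not the model used by PrimeSim MOSRA, which relies on foundry-calibrated models.

```python
def delta_vth(t_years, coeff, exponent):
    """Threshold-voltage shift (V) after t_years under a power-law model."""
    return coeff * t_years ** exponent


def years_to_margin(margin_v, coeff, exponent):
    """Time at which the power-law shift reaches margin_v (inverse of delta_vth)."""
    return (margin_v / coeff) ** (1.0 / exponent)


# Hypothetical values: 5 mV shift after one year, exponent 0.2, 10 mV margin.
COEFF_V = 0.005
EXPONENT = 0.2
MARGIN_V = 0.010

lifetime_years = years_to_margin(MARGIN_V, COEFF_V, EXPONENT)
print(f"Margin of {MARGIN_V * 1000:.0f} mV reached after ~{lifetime_years:.0f} years")
```

Because the exponent is well below one, most of the shift happens early, and doubling the margin extends the estimated lifetime by far more than a factor of two.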
Finally, the memory reliability solution is complementary to Synopsys’ integrated Silicon Lifecycle Management (SLM) product family, which improves silicon reliability and performance at every phase of the device lifecycle:
- Design phase: Integrate on-chip monitors that provide information about dynamic on-chip environmental conditions in later phases; the reliability analysis takes place in parallel with this phase
- In-ramp phase: Focus on achieving acceptable yield before mass production by identifying systematic failures in silicon
- Production phase: Continuously monitor and analyze all associated manufacturing test data to maintain high yield and reliability
- In-field phase: Monitor device aging and health in real time while in use, and extend device life by optimizing performance and power consumption
Reliability throughout the lifecycle of silicon is essential for many demanding applications that drive memory design. Only a full spectrum of analysis, tied to a unified flow and certified by foundries, can provide the level of certainty stipulated by industry standards, demanded by customers and required by end users.
— Guy Cortez is senior product manager at Synopsys.