
Posted in Top Stories

Semiconductor Test – Toward a Data-Driven Future

By Keith Schaub, Vice President, Marketing and Business Development, Applied Research and Technology Group, Advantest America

Integrating new and emerging technologies into Advantest’s offerings is vital to ensuring we are on top of future requirements so that we are continually expanding the value we provide to our customers. Industry 4.0 is changing the way we live and work, as well as how we interact with each other and our environment.

This article will look at some key trends driving this new Industry 4.0 era – how they evolved and where they’re headed. Then, we’ll highlight some use cases that could become part of semiconductor test as it drives towards a data-driven future. 

The past

To understand where we’re headed, we need to understand where we’ve been. In the past, we tested to collect data (and we still do today). We’ve accomplished tremendous things – optimizing test-cell automation and gathering and analyzing yield learnings, process-drift data, and statistical information, to name a few. But we ran into limitations.

For instance, we lacked the tools necessary to make full use of the data. Data was often siloed or disconnected. Moreover, it was rarely in a useful format, so data from one insertion could not be used at another; not being able to reuse data across multiple insertions reduced its value. Sometimes, we were simply missing high-value data, or collecting the wrong type of data altogether.

The future

Moving forward, we expect the data we collect to drive the way we test. Siloed data systems will start to be connected, so that data can move quickly and seamlessly from one insertion to another – fed both forward and backward – as we move further into Industry 4.0. This will allow us to tie together all of the different datasets across the test chain: wafer, package, and system-level test. These datasets will be very large (terabytes to petabytes), and when we apply artificial intelligence (AI) techniques to them, we’ll gain new insights and new intelligence to guide what and where we should be testing.

We’ll ask new questions we hadn’t thought to ask before, as well as explore long-standing ones. For example, one dilemma we’ve faced for years is how best to optimize the entire test flow, from inception to the end of the test cycle. Should a given test be performed earlier? Later? Is it valuable, or should it come out? Do we need more tests? How much testing do we need to achieve the quality metric we’re shooting for? In the Industry 4.0 era, we’ll start seeing answers to these questions, answers that will resonate throughout the test world.

Data…and more data

Today, thanks to the convergence of data lakes and streams, we have more data available to us than ever before. In the last two years alone, we’ve generated more data than in all prior human history, and this trend will only accelerate. According to some estimates, within the next few years we will be generating 44 exabytes per day – roughly 5 billion DVDs’ worth of data every day. Stacked up, those DVDs would be taller than 36,700 Washington Monuments, and at that rate the discs would circle the globe in about a week (see Figure 1).

Figure 1. The volume of data we generate will soon reach 44 exabytes, or 5 billion DVDs, per day. Since this amount of data could circle the earth in about seven days, an “earth byte” could equate to a week’s worth of data.

These kinds of numbers are so massive that the term “Big Data” doesn’t really suffice. We need a global image to help visualize just how much data we will be generating on a daily basis. Based on these numbers, we could begin using the term “earth byte” to describe how much data is generated per week. Regardless of what we call it, it’s an unprecedented amount of data, and it is the fuel behind Industry 4.0.

Industry 4.0 pillars

Five key pillars are driving and sustaining the Industry 4.0 era (Figure 2):

  • Big Data – as noted above, we are generating an unprecedented and near-infinite amount of data, roughly half of which comes from our cell phones, with much of the rest from the IoT
  • IoT – sensor-rich and fully connected, the IoT is generating a wealth of data related to monitoring our environment – temperature, humidity, location, etc.
  • 5G – the 5G global wireless infrastructure will enable near-zero-latency access to all of the data being generated
  • Cloud computing – allows us to easily and efficiently store and access all our earth bytes of data
  • AI – we need AI techniques (machine learning, deep learning) to analyze these large datasets in real time as they are sent to the cloud, in order to produce high-value, actionable insights

Figure 2. The five key pillars of Industry 4.0 are all interconnected and interdependent.

Because they are all reinforcing and accelerating each other, these Industry 4.0 trends are driving entire industries and new business models, creating an environment and level of activity that’s unprecedented.

Hypothetical use cases

Now that we’ve looked at where the test industry has been and what is driving where we’re headed, let’s examine some theoretical use cases (grounded in reality) that provide a visionary snapshot of ways we may be able to leverage the Industry 4.0 era to heighten and improve the test function and customers’ results. Figure 3 provides a snapshot of these five use cases.


Figure 3. Industry 4.0 will enable advancements in many areas of the test business.

1) Understanding customers better – across the supply chain

This use case encompasses various customer-related aspects that Industry 4.0 will enable us to understand and tie together to create new solutions. These include:

    • Customers’ march toward and beyond 5nm and how wafer, package, and system-level testing will work together for them
    • The entire supply chain’s cost challenges, which will help us optimize products and services across the value chain
    • How automotive quality requirements are driving into other business segments – as autonomous vehicles will be connected to everything across 5G, the quality of the connected network and its components will be forced to improve
    • 5G’s advanced technologies, including phased arrays, over-the-air, and millimeter-wave, all of which are already mature in the aerospace and military sectors – we will need to be able to leverage those technologies, cost them down appropriately, and support them for high-volume testing 

2) Decision making – yield prediction
The ability to predict yields will change everything. If you know, based on historical process data, that you’ll experience a yield drop within the next one to two months, you can start additional wafers to offset the drop. This relatively easy fix would cause very little disruption to the supply chain.

If you can solve this problem, however, the next obvious question is, what’s causing it? Why don’t I just fix it before it happens? This involves prescriptive analytics, which will follow predictive analytics. Say you have developed a new generation of a product. You’ve collected yield data at all test insertions for previous generations of the product, which share DNA with the new incarnation. Combining past data with present data creates a model that enables highly accurate predictions about how the wafer will perform as it moves through the supply chain.
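As a toy illustration of the predictive side of this idea, the sketch below fits a linear trend to invented historical lot yields and sizes the extra wafer starts needed to offset a projected drop. All numbers and names here are hypothetical; a real system would use far richer models and data than a straight-line fit.

```python
# Hypothetical sketch: anticipating a yield drop from historical lot data.
# The lot numbers, yields, and die counts below are invented for illustration.

def fit_trend(xs, ys):
    """Ordinary least-squares fit y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

def wafers_to_start(target_good_dies, predicted_yield, dies_per_wafer):
    """How many wafer starts are needed given the predicted yield."""
    good_per_wafer = int(predicted_yield * dies_per_wafer)
    return -(-target_good_dies // good_per_wafer)  # ceiling division

# Yield (fraction of good dies) over eight weekly lots, drifting downward.
weeks = [0, 1, 2, 3, 4, 5, 6, 7]
yields = [0.95, 0.94, 0.94, 0.93, 0.92, 0.92, 0.91, 0.90]

a, b = fit_trend(weeks, yields)
predicted = a * 12 + b  # projected yield about five weeks out
print(f"Projected yield in week 12: {predicted:.3f}")
print(f"Wafer starts for 100k good dies: {wafers_to_start(100_000, predicted, 500)}")
```

In practice the prediction model would be trained on many insertions’ worth of data, but the planning step – converting a predicted yield into additional wafer starts – has this same shape.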

3) Creating new customer value – predictive maintenance
This use case is the most likely to come to fruition in the near term. Maintenance contracts require carrying inventory, spare parts and myriad logistics – they represent a huge cost. Soon, by combining tester fleet data with customer data and implementing machine learning, we’ll be able to dramatically improve tester availability, reduce planned maintenance, and decrease losses due to service interruptions. This will allow us to replace modules before they fail.

Predictive maintenance is a proven practice that’s already being used in other industries, such as oil and gas. IoT sensor arrays are applied to the huge pipes and pumps controlling the flow of chemicals, measuring stress, flow rates, and other parameters. The data from these sensors predicts when a pump is going to wear out or a pipe needs to be replaced before it fails. We can leverage, redefine, and redeploy this implementation for our use case. Soon, a field service engineer could show up with a replacement module before you even know that you need it.
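A minimal sketch of the sensor-based idea, with invented readings and thresholds: smooth a wear-related signal, project it forward over the service lead time, and flag the module for replacement before the projected value crosses a failure limit. A production system would use fleet-wide data and a trained model rather than this simple extrapolation.

```python
# Hypothetical predictive-maintenance check; all values are invented.

def ewma(readings, alpha=0.3):
    """Exponentially weighted moving average, smoothing sensor noise."""
    smoothed = readings[0]
    for r in readings[1:]:
        smoothed = alpha * r + (1 - alpha) * smoothed
    return smoothed

def needs_replacement(readings, limit, slope_per_day, lead_time_days):
    """Flag a module when its smoothed wear metric is projected to cross
    the failure limit within the service lead time."""
    projected = ewma(readings) + slope_per_day * lead_time_days
    return projected >= limit

# Daily vibration readings from a hypothetical pump/module, drifting upward.
readings = [1.0, 1.1, 1.0, 1.2, 1.3, 1.5, 1.6, 1.8]
print(needs_replacement(readings, limit=2.5, slope_per_day=0.12,
                        lead_time_days=10))  # -> True: schedule a swap now
```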

4) Monetization – using data in new ways to drive our business
Data is an asset, and we’ll start to derive new business from sharing access to, or leasing the use of, our data assets. One example might be a tester digital twin that resides in the cloud. Imagine that customers’ chip model data could be fed into this digital twin as a kind of virtual insertion, with outputs such as predicted performance and yield. Customer benefits would include optimized programs, recommended tests, and predicted test coverage at each virtual insertion. This would enable them to optimize the entire flow depending on the product life cycle – perhaps the test order could be changed, or a test added, in order to improve quality. Because Advantest owns all the data that comes from our testers, we could lease or sell chipmakers access to it, creating a significant business opportunity.

5) Automating and improving business operations – driving efficiencies
The test engineering community struggles to find ways to improve process efficiencies. One way to do this is with intelligent assistants. Still in its infancy, this category of AI can best be described as a trained assistant that guides you helpfully as you try to perform a task.

For example, say we are validating a new 5G RF product on our Wave Scale RF card on the V93000 tester. All the pieces are being brought together – load board, tester, socket, chip, test program – and if there are any problems, the whole thing won’t work, or you’ll get only partial functionality. An intelligent assistant or ‘bot’ trained in the necessary skillset could dynamically monitor the inputs, outputs, and engineers’ interactions and provide real-time suggestions or recommendations on how to resolve the issues. At first it won’t be smart, but it will learn quickly from the volume of data and will improve its recommendations over time.

As you can see, AI’s potential is vast. It will touch all aspects of our lives, but at its core, AI is really just another tool. Just as the computer age revolutionized our lives in the ’80s and ’90s, AI and Big Data will disrupt every industry we can think of – and some we haven’t yet imagined. Those slow to adopt AI as a tool risk being left behind, while those that embrace AI and learn to fully utilize it for their industries will be the future leaders and visionaries of Industry 5.0, whatever that may be.

Did you enjoy this article? Subscribe to GOSEMI AND BEYOND



Overlapping Speech Transcription Could Help Contend with ATE Complexity

By Keith Schaub, Vice President of Business Development for US Applied Research & Technology, Advantest America Inc.

Introduction

Increasingly complex chipsets are driving corresponding increases in semiconductor test system hardware and software. Artificial intelligence offers innovative, ingenious opportunities to mitigate the challenges that test engineers and test-system operators face and to improve security and traceability. Advantest, which fields thousands of test systems worldwide that test billions of devices per year, is studying several ways in which AI can help.

Initial work has involved facial recognition and overlapping speech transcription (the latter being the focus of this article), both of which can reduce the need for a mouse and keyboard interface. With a mouse and keyboard, operators can leave themselves logged in when other operators take over, creating security vulnerabilities and making it difficult, for example, to trace which operator was on duty during a subsequently detected yield-limiting event. A voice-recognition system could facilitate identifying which operators gave which commands.

Industrial cocktail-party problem

Implementing a voice-recognition system in a test lab or production floor presents its own challenges, with air-cooled systems’ fans whirring and multiple teams of engineers and operators conversing—creating an industrial version of the cocktail-party problem.

To address this problem, Advantest has developed a fast, multi-speaker transcription system that accurately transcribes speech and labels the speakers.

The transcription process comprises three main steps: speaker separation, speaker labeling, and transcription. For the first step, a real-time GPU-based TensorFlow implementation of the deep-clustering model recently developed by Mitsubishi1 separates the mixed-source audio into discrete individual-speaker audio streams. A matrix of audio-frequency-domain vectors obtained by the short-time Fourier transform (STFT) serves as the input to this model. The model learns feature transformations called embeddings using an unsupervised, auto-associative deep network, followed by a traditional k-means clustering method, which outputs the clusters used to generate single-speaker audio. (Recent implementations have shown significant improvements over traditional spectral methods.)

The second step involves an implementation of Fisher Linear Semi-Discriminant Analysis (FLD)2 for an accurate diarization process to label the speakers for each audio stream that the clustering model generated in the separation step. The third and final step makes use of the Google Cloud speech-to-text API to transcribe the audio streams, assigning a speaker based on the diarization step.
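To make the clustering stage concrete, here is a toy, pure-Python sketch of the k-means step in isolation. The 2-D “embeddings” are invented stand-ins for the high-dimensional embeddings the deep network would learn; the binary masks mirror how cluster assignments select each speaker’s time-frequency bins.

```python
# Toy sketch of the k-means stage of speaker separation. The embeddings
# below are invented 2-D points; a real system clusters deep-network
# embeddings, one per time-frequency bin of the STFT.

def dist2(p, q):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k=2, iters=20):
    """Plain k-means; returns one cluster label per point."""
    centers = points[:k]  # naive initialization: first k points
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members)
                                   for x in zip(*members))
    return labels

# Embeddings for six time-frequency bins: two well-separated speaker groups.
embeddings = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0),
              (0.9, 1.0), (1.0, 0.9), (1.1, 1.1)]
labels = kmeans(embeddings)

# Binary masks: masks[s][t] = 1 where bin t is assigned to speaker s.
masks = [[1 if lab == s else 0 for lab in labels] for s in range(2)]
print(masks)
```

Applying each mask to the mixed STFT and inverting it would yield one audio stream per speaker, which is the output the labeling and transcription steps consume.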

Figure 1: This system-flow diagram illustrates the steps in the overlapping speech-transcription process, from the audio input to the labeling of the speakers.

Figure 1 illustrates the system flow of the entire process. During the first step, the clustering separates the audio. The spectrogram of the mixed and separated audio (Figure 2) makes it easy to visualize the separation taking place.

Figure 2: A view of the spectrogram of the mixed and separated audio helps illustrate how the separation takes place.

Testing the model

We tested the model on the TED-LIUM Corpus Release 3,3 which is a collection of TED Talk audio and time-aligned transcriptions. To measure system accuracy, we compared our system-generated transcriptions to the ground-truth transcriptions using Word Error Rate (WER), defined as the proportion of word substitutions, insertions, and deletions incurred by the system. Our system demonstrated a WER of 26%, versus a ground-truth WER of approximately 14%. Overall, the generated transcripts were largely intelligible, as shown by the following example:

  • Actual Audio

“Most recent work, what I and my colleagues did, was put 32 people who were madly in love into a function MRI brain scanner, 17 who were. . .”

  • System Transcription

“Most recent work but I am my colleagues did was put 32 people who are madly in love into a functional MRI brain scanner 17 Hoover.”

As shown, the results are largely readable, even with the current word error rate.
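For reference, WER is a word-level Levenshtein distance normalized by the reference length. The sketch below computes it on a fragment of the quoted example; the implementation is standard, not Advantest’s internal scoring code.

```python
# Word Error Rate: (substitutions + insertions + deletions) / reference words,
# computed via word-level Levenshtein distance.

def word_error_rate(reference: str, hypothesis: str) -> float:
    ref = reference.split()
    hyp = hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution / match
    return d[len(ref)][len(hyp)] / len(ref)

# Fragment of the example above: two substitutions over nine reference words.
print(word_error_rate("most recent work what I and my colleagues did",
                      "most recent work but I am my colleagues did"))  # -> 2/9
```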

Often, the audio output from the separation step contains many artifacts, which lead to outputs readily understood by humans but more difficult for current speech-to-text converters to handle. Thus, we get an output like this:

  • Actual Audio

“Brain just like with you and me. But, anyway, not only does this person take on special meaning, you focus your attention on them…”

  • System Transcription

“Brain, it’s like with your and name. But anyway, I don’t leave something special meeting. I’m still get your attention from you a Grande, AZ them…”

Thus, when the clustering algorithm becomes unstable, the transcription is also erroneous. However, many of these errors can likely be fixed in future work.

Overall, overlapping speech has presented a daunting problem for many applications, including automated transcription and diarization. But recent innovations in learned embeddings for speaker segmentation make it possible to produce accurate, real-time transcription of overlapping speech. The clustering model is the most computationally expensive step, but because it is implemented in TensorFlow and GPU-optimized, the system can run in real time.

Nevertheless, implementations of such systems are currently very limited due to relatively low accuracy, which we believe is likely the result of the clustering model using binary (discrete) masks1 to output each speaker’s audio. We will investigate continuous masking to improve the audio quality enough for live transcription at live events.

Virtual engineering assistant for ATE

Ultimately, we envision AI techniques such as overlapping speech transcription being useful in developing an AI-based engineering assistant for ATE, as outlined in a presentation at the 2018 International Test Conference. In the high-decibel environment of the test floor, overlapping speech transcription could help solve the cocktail-party problem, allowing the virtual assistant – a test-engineering equivalent of Iron Man’s J.A.R.V.I.S. – to respond to one particular engineer or operator.

Overlapping speech transcription is just one way of interacting with such an assistant. At Advantest, we have also experimented with facial recognition, using software that can create what is essentially a “face fingerprint” from a single image, eliminating the traditional need for thousands of training images. We have found that the technology performs well at a variety of angles (photographing the subject from 30 degrees left or right, for example) and at a variety of distances (image sizes). Eventually, such technology might enable the virtual assistant to proactively intervene when it recognizes a look of frustration on an engineer’s face, intuiting what information may be helpful in solving the problem at hand.

Beyond speech-transcription and facial-recognition capabilities, a virtual engineering assistant would embody a wealth of highly specialized domain knowledge, with many cognitive agents offering expertise extending from RF device test to load-board design. Such an assistant would be well versed in test-system features that might be only occasionally required over the long lifetime of expensive equipment with a steep learning curve. Ultimately, such an assistant could exhibit intuition, just as game-playing AI machines do: they have mastered “perfect information” games like checkers and chess and have become competitive at games like poker, with its imperfect information and ability to bluff. Although computers haven’t traditionally been thought to be intuitive, it may turn out that intuition evolves from deep and highly specialized knowledge of a specific domain.

References

1. Hershey, John R., et al., “Deep Clustering: Discriminative Embeddings for Segmentation and Separation,” 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016. https://ieeexplore.ieee.org/document/7471631

2. Giannakopoulos, Theodoros, and Sergios Petridis, “Fisher Linear Semi-Discriminant Analysis for Speaker Diarization,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 7, 2012, pp. 1913-1922. https://ieeexplore.ieee.org/document/6171836

3. Hernandez, François, et al., “TED-LIUM 3: Twice as Much Data and Corpus Repartition for Experiments on Speaker Adaptation,” Speech and Computer Lecture Notes in Computer Science, 2018, pp. 198-208. https://arxiv.org/abs/1805.04699




Flexible Automation Infrastructure Supports Continuous Test-program Integration and Delivery

By Stefan Zügner, V93000 Product Manager, Jan van Eyck, Product Owner SW R&D, Kheng How, Senior Staff Software Engineer and Daniel Blank, Senior Application Consultant Center of Expertise

Test engineers today are facing many challenges working within a collaborative test-program-development environment. Fortunately, a concept called continuous integration, or CI, can be implemented within the test-program development process to help meet these challenges.

Today, each engineer is likely to be part of a team of many engineers working on different parts of the same test program concurrently. Furthermore, test engineers developing IP blocks may be spread across widely scattered geographical locations.

The result of this complexity is that developers issue multiple program changes (commits) every day. Each commit changes the test program, and any commit may break it. Consequently, at any given time, the overall quality of the test program may be unknown, and problems can require significant time and resources to discover, debug, and fix. Moreover, the longer it takes to discover a bug, the more time and expense it takes to fix it.

Continuous integration addresses today’s test challenges

Collaborative development typically relies on an existing source-code management system (for example, git or SVN) for tracking changes, but by the time an integrator discovers issues, it is usually too late to fix them efficiently. With continuous-integration tooling, validation tests can now be triggered automatically whenever changes are committed to the source-code-management repository, allowing frequent integration and timely checks without additional overhead for the individual developer (as illustrated in Figure 1). Continuous delivery, in addition, automates the release-to-production process, allowing new test-program releases at essentially any point in time.
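The commit-triggered validation loop can be sketched as follows. This is a deliberately simplified stand-in for a CI server such as Jenkins: the commit records, the `validate` stub, and the notification function are all invented for illustration.

```python
# Minimal sketch of commit-triggered CI. In a real setup, a CI server
# (e.g., Jenkins) hooked to git/SVN plays the role of run_ci().

def notify(author, log):
    """Immediate feedback to the author of a breaking commit."""
    print(f"CI failure -> {author}: {log}")

def run_ci(commits, validate):
    """Run the validation suite on every commit and report pass/fail,
    so a breaking change is caught at commit time, not at integration."""
    results = {}
    for commit in commits:
        ok, log = validate(commit)
        results[commit["id"]] = "PASS" if ok else "FAIL"
        if not ok:
            notify(commit["author"], log)
    return results

# Invented commits: Bob's change breaks the (stubbed) test-program build.
def validate(commit):
    ok = "break" not in commit["diff"]
    return ok, "" if ok else f"compile error in commit {commit['id']}"

commits = [
    {"id": "a1", "author": "Alice",   "diff": "add RF test block"},
    {"id": "b2", "author": "Bob",     "diff": "break timing setup"},
    {"id": "c3", "author": "Charlie", "diff": "tune DC limits"},
]
print(run_ci(commits, validate))
```

The key property is per-commit feedback: Bob learns about the failure immediately, while Alice’s and Charlie’s commits are confirmed good, so the integrator never inherits an unknown-quality program.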

Figure 1: The continuous integration workflow embraces automated validation tests for each change to a test program.

The difference between traditional and continuous integration processes is illustrated in detail in Figure 2. With the traditional process, different engineers (Alice, Bob, and Charlie in Figure 2) independently develop sections of a test program, and yet another engineer (David) performs integration and test just before the program’s release. If David’s test finds a bug, deadlines could be at significant risk because of the time it may take Alice, Bob, or Charlie to debug and rework their code and resubmit it to David for further integration and test.

Figure 2: A traditional development process (top) can put release deadlines at risk. In contrast, a continuous integration process (bottom) reduces time-to-market and time-to-quality.

The continuous-integration process, in contrast, delivers continuous and systematic validation throughout the entire development cycle, providing immediate feedback to engineers such as Alice, Bob, and Charlie on the quality of their commits. The immediate feedback made possible through continuous integration reduces both time-to-market and time-to-quality. In addition, the automated test-program validation process includes programmatic checks, which allow engineering teams to establish quality processes in a repeatable manner.

Tools for continuous integration

Several software tools can serve in a continuous-integration system.1 One of them is Jenkins (https://jenkins.io/), an open-source and widely used automation tool with support from an active community that makes information widely available on the web. 

Jenkins is extensible and contains comprehensive plugins for functions such as source-code management (for example, git and SVN) and email notification.

In an implementation in which Jenkins is employed in a continuous-integration system (Figure 3), every commit can trigger the automatic running of validation jobs offline or online according to the job setup. The system stores and manages execution logs and test results while sending out notifications and reports on each execution. With the continuous-automation system automating test-program validation, test-program developers can focus on development.

Figure 3: In a Jenkins-based continuous-integration system, each commit triggers automated validation jobs, with execution logs, results, and notifications managed by the system.

Adding Smart CI to SmarTest 8

For semiconductor test-program development, Advantest offers its SmarTest 8 software for the V93000 platform.2 SmarTest 8 builds on previous versions to offer fast test-program development, efficient debug and characterization, high throughput due to automated optimization, faster time to market, ease of test-block reuse, and efficient collaboration.

To support continuous integration and delivery for test-program development in the SmarTest 8 environment, Advantest offers the Smart CI solution. The Smart CI solution includes the Smart CI custom Jenkins server plugin, which is tailored for SmarTest 8. The plugin offers simple validation job setup through “fill-in-the-blanks” forms, and it supports freestyle (GUI-based, one client) and pipeline (script-based, distributed single validation job on multiple clients) setups.

Tightly integrated with the plugin is the Smart CI Client for SmarTest 8, which provides a command line interface (CLI) to enable continuous integration and delivery for SmarTest 8 test programs. Smart CI Client can also be used for other CI solutions not incorporating Jenkins.

Also included in the Advantest Smart CI solution are Docker images for each individual SmarTest 8 release, allowing for a simplified Smart CI application. The Docker images offer preconfigured setup and enable virtual-machine (VM) and cloud installations. Multiple SmarTest 8 versions and offline jobs can also be run on the same workstation concurrently.

Smart CI works out-of-the-box

Smart CI works out of the box. Just enter a test-program name, and the program compiles, loads, and executes, comparing results against datalog and throughput references.
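The compile-load-execute-compare flow can be illustrated with a small sketch of the final comparison step. The function name, datalog structure, and tolerance below are invented for illustration; the actual Smart CI Client is a command-line tool, not a Python API.

```python
# Hypothetical sketch of the "compare against references" check described
# above. All names, test labels, and tolerances are invented.

def check_against_references(datalog, ref_datalog, throughput, ref_throughput,
                             tput_tolerance=0.05):
    """Pass if every test result matches the reference datalog and the
    measured throughput is within tolerance of the reference."""
    mismatches = [t for t in ref_datalog if datalog.get(t) != ref_datalog[t]]
    tput_ok = throughput >= ref_throughput * (1 - tput_tolerance)
    return (not mismatches) and tput_ok, mismatches

ok, bad = check_against_references(
    datalog={"vdd_test": "PASS", "rf_gain": "PASS"},
    ref_datalog={"vdd_test": "PASS", "rf_gain": "PASS"},
    throughput=98.0,        # devices per hour, invented
    ref_throughput=100.0)
print(ok, bad)  # -> True []
```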

In addition to working out of the box, Smart CI offers Advantest templates that can be adapted with low to medium effort by a lead test engineer to validate a test program with customer-specific checker scripts. Customized results are available via offline execution.

Beyond the continuous integration enabled by Smart CI today, Advantest’s roadmap calls for the future implementation of continuous delivery, in which a test program (optionally encrypted) can be exported for production, and test-program validation can take place in a production environment, including a test cell. As such, Smart CI will also offer integration with the built-in or custom release checkers of TP360.3 TP360 is a software package that helps V93000 customers increase test-program development efficiency, optimize test-program quality and throughput, reduce cost of test, and increase test-program release and correlation efficiency. It is based on an open framework that enables users to add new applications easily and flexibly.

As does continuous integration, continuous delivery will work out of the box—an engineer need only enter a test-program name.

Conclusion

In summary, Smart CI enables automated continuous integration and delivery for SmarTest 8, saving test-program development time and effort and boosting engineering capacity by 10% to 15%. Smart CI ensures test-program quality through fully automated and systematic test-program quality checks throughout the entire development cycle, and it enables the release of runtime-ready test programs at any time. Furthermore, it fosters discipline in engineering teams, enabling team members to consistently deliver high quality, and it provides clear project status reports anytime, thereby increasing manageability and predictability. Smart CI Docker images simplify installation and maintenance, the Advantest Jenkins server plugin supports easy validation job setup, and Advantest provides comprehensive support and continuous enhancements.

References

1. “Comparison of continuous integration software,” Wikipedia. https://en.wikipedia.org/wiki/Comparison_of_continuous_integration_software

2. Donners, Rainer, “A Smarter SmarTest: ATE Software for the Next Generation of Electronics,” GO SEMI & BEYOND, August 3, 2017. http://www.gosemiandbeyond.com/a-smarter-smartest-ate-software-for-the-next-generation-of-electronics/

3. Zhang, Zu-Liang, “TP360—Test Program 360,” Video, VOICE 2013. https://vimeo.com/80319228




PCB Design Topologies for 5G and WiGig ATE Applications

The following article is an adapted excerpt of a DesignCon 2019 Best Paper Award winner. The full paper is included in the conference proceedings, which can be purchased here.

By Giovani Bianchi, Senior Staff Engineer, and José Moreira, R&D Engineer, Advantest Corp.; and Alexander Quint, Student, Karlsruhe Institute of Technology, Germany

The opening of the millimeter-wave (mmWave) spectrum to the next generation of mobile communications introduces mmWave-based communications to the consumer arena. This new generation includes 5G and WiGig [Wi-Fi-certified 60GHz communications]. From a test engineering point of view, mmWave communications require a significant jump in testing frequencies, from the maximum of 6 gigahertz (GHz) used for LTE applications to frequencies as high as 44GHz for 5G and 72GHz for WiGig.

In addition, these new applications use phased array antennas, which means there are many more radio-frequency (RF) ports that need to be tested compared to LTE applications. At the same time, the same cost-of-test pressure for consumer applications applies to testing these new mmWave integrated circuits (ICs).

While mmWave applications pose a variety of new challenges for the automated test equipment (ATE) test engineering community, this article concentrates on a specific topic: the design of printed circuit board (PCB) combiners/dividers that can aggregate multiple RF ports into a single measurement/stimulus port. This can be vital for reducing cost of test. Deciding to use a combiner/divider will be highly dependent on the target application, testing phase (e.g., initial characterization or high-volume production), available ATE resources and test strategy.

What is a combiner/divider?

In general, power combiners/dividers are passive N-port networks (N ≥ 3). Used as power dividers, they split the power of an input signal into two or more output signals; used in reverse, they combine multiple input signals into one output signal of higher power [1], in which case the device is called a power combiner. The input of a power divider is the output of a power combiner, and vice versa.

The easiest way to build a three-port power divider is shown in Figure 1. It consists of a T-junction and a susceptance, which represents the discontinuities in the junction. A reciprocal three-port network (which a power divider is) can never be lossless and matched at all ports [1]. Therefore, to be matched at all ports, resistive elements must be added to the power divider.

Figure 1: General schematic of a power divider.

Wilkinson combiner/divider

Another disadvantage of the simple power divider shown in Figure 1 is that the two output ports are not isolated against each other. To obtain isolation between the output ports, Wilkinson power dividers [1,2] are used. A two-way Wilkinson power divider schematic is shown in Figure 2.

Figure 2: Schematic of Wilkinson power divider.

The two quarter-wave transformers provide good input matching, whereas the resistor between the two outputs provides good isolation between the output ports. If the output ports are both matched, the Wilkinson power divider even appears lossless because there is no current flowing through the resistor. Wilkinson power dividers can be designed for multiple outputs; can have unequal power ratios; and can be extended using multiple sections to achieve higher bandwidth.
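For the common equal-split case, the standard textbook design relations are simple: each quarter-wave arm has a characteristic impedance of √2·Z0, and the isolation resistor between the outputs is 2·Z0. The snippet below just evaluates these relations for a 50Ω system; it is a design-value calculator, not a simulation.

```python
import math

# Equal-split two-way Wilkinson divider design values (standard textbook
# relations): quarter-wave arms of sqrt(2)*Z0, isolation resistor of 2*Z0.

def wilkinson_design(z0):
    z_arm = math.sqrt(2) * z0  # characteristic impedance of each lambda/4 arm
    r_iso = 2 * z0             # isolation resistor between the output ports
    return z_arm, r_iso

z_arm, r_iso = wilkinson_design(50.0)
print(f"lambda/4 arm: {z_arm:.1f} ohm, isolation resistor: {r_iso:.0f} ohm")
# -> lambda/4 arm: 70.7 ohm, isolation resistor: 100 ohm
```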

Wilkinson power dividers are well suited for frequencies in the range of 20-40GHz. However, power dividers with resistors are difficult to use in frequency ranges above 50GHz. For these scenarios, the better choice is a hybrid ring.
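The classic equal-split design equations from [1] can be sketched as follows; the effective dielectric constant used for the quarter-wave length is an assumed PCB value, not a measured one.

```python
import math

# Equal-split Wilkinson design values (see [1]). The effective
# dielectric constant (eeff) is an assumed, illustrative PCB value.
def wilkinson(z0, f_hz, eeff):
    z_quarter = z0 * math.sqrt(2)        # quarter-wave transformer impedance
    r_iso = 2 * z0                       # isolation resistor between outputs
    c = 299_792_458.0                    # speed of light, m/s
    quarter_wave_mm = c / (f_hz * math.sqrt(eeff)) / 4 * 1e3
    return z_quarter, r_iso, quarter_wave_mm

zq, r, l = wilkinson(50.0, 30e9, 2.9)    # mid-band of the 20-40GHz target
# zq ~ 70.7 ohms, r = 100 ohms, l ~ 1.5 mm
```

The millimeter-scale quarter-wave length at these frequencies is what makes the structures small enough to fit the tight DUT pitches discussed later.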

Hybrid ring or rat-race combiner/divider

While it is not possible to build lossless three-port networks that are matched at all ports, it is possible to do so with four-port networks. One easy way to realize a four-port power divider is with a hybrid 180° coupler, such as the hybrid ring or rat-race example shown in Figure 3 [1].

If port 1 or 3 is used as the input port, the output ports (2 and 3, or 1 and 4, respectively) are in phase, and port 4 or 2, respectively, is isolated. If port 2 or 4 is used as the input port, the output ports (1 and 4, or 2 and 3, respectively) are shifted by 180°, and port 3 or 1, respectively, is isolated. A hybrid ring can also be used as a power combiner with two inputs, one sum output and one difference output. For example, if ports 2 and 3 are used as input ports, port 1 is the sum output and port 4 is the difference output.

Figure 3: Hybrid ring structure
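The sum/difference behavior described above can be sketched with ideal phasors; this is a simplified model of a lossless 180° hybrid, not a layout-level simulation.

```python
import math

# Ideal 180-degree hybrid used as a combiner (ports 2 and 3 driven):
# port 1 carries the scaled sum, port 4 the scaled difference.
# A minimal sketch of the ideal behavior only.
def hybrid_ring_combine(a2, a3):
    s = (a2 + a3) / math.sqrt(2)   # sum port (port 1)
    d = (a2 - a3) / math.sqrt(2)   # difference port (port 4)
    return s, d

# In-phase inputs add at the sum port and cancel at the difference port:
s, d = hybrid_ring_combine(1 + 0j, 1 + 0j)
# |s|^2 = 2 (all power at the sum port), |d| = 0
```

Driving the inputs in anti-phase reverses the roles: the power appears at the difference port instead.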

ATE test fixture challenges


ATE PCB test fixtures used in mmWave applications face some key challenges, as described below.

  • PCB size and thickness. Figure 4 shows a typical multi-site ATE PCB test fixture for high-volume production testing of an LTE-related RF device (below 6GHz). These PCBs are very large (e.g., 516.8mm x 600mm) and thick—a minimum thickness of 3.5mm is required for some ATE platforms, with stack-up thickness reaching 5mm or higher. In addition, while multi-site setups are necessary for parallel testing (essential for reducing cost of test), handler requirements can cause the pitch between the devices under test (DUTs) to be very tight.

Figure 4: Example of ATE test fixture with eight sites for high-volume testing
of an RF integrated circuit (<6GHz). Courtesy of Spreadtrum.

  • Small manufacturing volumes. Compared with higher-volume PCB applications, the manufacturing volume for an ATE PCB test fixture can be as small as only one or two boards at the start of a project. In addition to their size and complexity, this means these boards will be relatively expensive.
  • DUT pitch. Currently, ball grid array (BGA) pitches for mmWave applications can be less than 0.4mm. This small pitch, coupled with the large, thick PCB test fixture, further complicates manufacturing. It will also create mechanical restrictions on any combiner/divider designs one needs to implement to connect to the DUT BGA pads.
  • Dielectric material. High-performance dielectric materials have been the default choice in mmWave applications, but cannot be used for the large, high-layer-count PCBs typical of ATE applications. Ideally, traditional high-performance materials that are already in use for ATE test fixture applications can be deployed. Hybrid stack-ups can also be used, with a high-performance RF material in the outer layers and standard FR4 in the inner layers, but this should be discussed in detail with the PCB test fixture fab house.

    The type of fiber weave used is also a critical factor in dielectric material selection. Typical high-performance RF materials for ATE PCB test fixture manufacturing use a glass weave, which affects the dielectric loss, the dielectric constant and, ultimately, the signal delay. This matters because mmWave applications usually use phased-array antennas, so the phase of each element is critical. On the PCB test fixture, the phase delay of all interconnects to the antenna ports must be the same. To minimize local differences in dielectric constant, either a spread-glass fiber weave can be used, or the PCB test fixture can be rotated 10 degrees on the manufactured panel.
  • Microstrip copper profile and plating. For traditional RF applications (< 6GHz), only the skin effect and dielectric loss were important, but for mmWave applications the surface roughness loss becomes important [4,5]. This means that when selecting the dielectric material, one also needs to consider the type of copper profile to be used, taking into account the manufacturing and reliability requirements of the PCB test fixture. Choosing a very low-profile copper, for example, may make sense for loss mitigation, but if the PCB fab house cannot guarantee its reliability for the specific requirements of the ATE PCB test fixture, it may generate other problems and should not be used.
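To put the weave-induced dielectric-constant variation in perspective, here is a short back-of-the-envelope calculation; the trace length and the two Dk values are illustrative assumptions.

```python
import math

# Effect of local dielectric-constant variation (glass weave) on the
# phase delay of antenna interconnects. All values are illustrative.
def phase_delay_ps(length_mm, dk_eff):
    c_mm_per_ps = 0.299792458          # speed of light in mm/ps
    return length_mm * math.sqrt(dk_eff) / c_mm_per_ps

# Two nominally identical 30 mm traces whose routing crosses different
# parts of the weave (effective Dk 3.0 vs 3.2):
skew_ps = phase_delay_ps(30, 3.2) - phase_delay_ps(30, 3.0)
# At 28 GHz one carrier period is ~35.7 ps, so a few ps of skew is a
# significant fraction of a cycle between antenna elements
```

This is why spread-glass weaves or panel rotation are worth the discussion with the fab house.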

Implementing a Wilkinson combiner [for complete analysis and all figures, please see full paper]

As mentioned, ATE PCB test fixtures present significant challenges due to their size and requirements. Although there are off-the-shelf combiners/dividers with excellent performance, especially for the 5G frequency range, they use materials and implementation techniques that are not viable for an ATE PCB test fixture. PCB size and the need for a multilayer implementation limit the types of possible approaches. Also, the large number of ports required for mmWave applications, coupled with the need to test multiple DUTs in parallel, requires that the combiner/divider structures be small and omit processes incompatible with standard ATE PCB test fixture assembly.

Figure 5 shows three examples of a two-way Wilkinson combiner/divider element targeted for 5G applications (the target design range was 20-40 GHz), chosen for their implementation simplicity and small size. Example 1 is the simplest layout for a Wilkinson power divider; the quarter-wave transformers are curved to reduce coupling between them. Example 2 adds only a small modification: a short line of different width in front of the quarter-wave transformers further improves the input matching. Example 3 is the most complex, since it consists of two power-divider stages; in general, this type has a higher bandwidth than a single-stage Wilkinson divider.

Figure 5: Implementation examples of single two-way Wilkinson combiners/dividers.

The key metrics when evaluating a combiner/divider are its phase matching (which, with a Wilkinson simulation model, is always perfect), the return loss at each port and the loss matching across the frequency of interest. Tables 1 and 2 show the insertion loss and the return loss of the common port for the three examples at five different frequencies.

Table 1: Comparison of the simulated insertion loss for each Wilkinson example.

Table 2: Comparison of the simulated return loss for each Wilkinson example.

The results show that Example 2 has an overall improved return loss on the common port compared with Example 1. Example 3 shows slightly less variation on the insertion loss compared with the other examples. These differences may seem small when comparing single elements but they do amplify once one begins to aggregate the elements in a more complex Wilkinson combiner (e.g., a 1-to-8 Wilkinson combiner/divider).
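The way small per-element differences accumulate can be estimated with a simple loss budget; the per-stage excess (dissipative) loss below is an assumed figure for illustration, not a measured one.

```python
import math

# Loss budget when cascading two-way Wilkinson elements into a
# 1-to-N combiner/divider tree. Excess loss per stage is assumed.
def one_to_n_loss_db(n_outputs, excess_db_per_stage):
    stages = math.log2(n_outputs)
    split_db = 10 * math.log10(n_outputs)   # unavoidable 1/N power split
    return split_db + stages * excess_db_per_stage

loss = one_to_n_loss_db(8, 0.5)   # ~9.03 dB split + 1.5 dB excess
# a 0.1 dB difference per element becomes 0.3 dB across the three
# stages of a 1-to-8 tree, which is why the small differences between
# Examples 1-3 matter once elements are aggregated
```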

Implementing a hybrid ring [for complete analysis and all figures, please see full paper]

As mentioned in the previous section, hybrid ring combiners can work at higher frequencies than traditional Wilkinson combiners, but with a smaller bandwidth. The design presented here targets WiGig applications in the 56-72GHz frequency band. Unfortunately, off-the-shelf combiners in coaxial packages are not easy to come by in this frequency range, although some vendors can create them by special request.

Figure 6 provides two examples of implementing a single two-way hybrid ring combiner/divider element targeted for the WiGig frequency range. The shape of the ring in both examples is not exactly circular due to the rectilinear T-junctions. The layout of Example 2 looks more regular due to its smaller T-junctions, to which trapezoidal tapers have been added to shorten the junctions and geometrically match the width of the 50-ohm lines.

Figure 6: Implementation examples of single 2-way Hybrid ring combiners/dividers.

Figure 7 shows simulated and measured results for this structure. The connectors were not de-embedded from the measured data, so a full 3D EM simulation was done, including the connector model from Signal Microwave. The PCB parameters used in the simulation (dielectric constant and loss tangent) were based on values tuned from a previous test coupon using the procedure described in [5]; this is critical for obtaining more accurate simulation results.

Figure 7: Results of the hybrid ring test coupon simulation and measurement.

The results show a reasonable correlation to simulation, even though the simulation assumed perfect etching of the structure. The target bandwidth of the hybrid ring (56 to 72GHz) was achieved with less than 1 dB measured amplitude imbalance and less than 25 degrees measured phase imbalance in that frequency range.
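The amplitude and phase imbalance figures quoted above can be computed directly from the complex S-parameters of the two output paths; a minimal sketch with illustrative values:

```python
import cmath, math

# Amplitude and phase imbalance between the two divider outputs,
# computed from complex S-parameters. The example values are
# illustrative, not the measured data from Figure 7.
def imbalance(s21, s31):
    amp_db = 20 * math.log10(abs(s21) / abs(s31))
    phase_deg = math.degrees(cmath.phase(s21 / s31))
    return amp_db, phase_deg

# Outputs differing by ~0.5 dB in amplitude and 10 degrees in phase:
a, p = imbalance(cmath.rect(0.70, math.radians(-90)),
                 cmath.rect(0.66, math.radians(-100)))
```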

Conclusion

In looking at combiner/divider design approaches for mmWave applications on ATE systems, special attention is given to the Wilkinson and hybrid ring (rat-race) combiner approaches because they’re more easily implemented on ATE PCB test fixtures. These fixtures present specific challenges that need to be considered in advance when designing a combiner/divider for 5G/WiGig applications. In this context, some of the challenges are new to the ATE test fixture design community. The importance of the 5G/WiGig applications will certainly generate design improvements and new ideas, both for combiner/divider topologies and for PCB manufacturing.

References

[1]  D. Pozar, Microwave Engineering, 4th Edition, Wiley 2011.

[2]  E. J. Wilkinson, “An N-Way Hybrid Power Divider,” IRE Transactions on Microwave Theory and Techniques, Vols. MTT-8, pp. 116-118, 1960.

[3]  Jose Moreira and Hubert Werkmann, An Engineer’s Guide to Automated Testing of High-Speed Interfaces, 2nd Edition, Artech House 2016.

[4]  Rogers Corporation, “Copper Foil Surface Roughness and its Effect on High Frequency Performance,” PCB West, 2016.

[5]  Heidi Barnes, Jose Moreira and Manuel Walz, “Non-Destructive Analysis and EM Model Tuning of PCB Signal Traces Using the Beatty Standard,” DesignCon, 2017.


Trustable AI for Dependable Systems

By Ira Leventhal, Vice President, Applied Research Technology and Ventures, Advantest America, and Jochen Rivoir, Fellow, Advantest Europe

Interest in implementing artificial intelligence (AI) for a wide range of industries has been growing steadily due to its potential for streamlining functions and delivering time and cost savings. However, in order for electrical and electronic systems utilizing AI to be truly dependable, the AI itself must be trustable.

As Figure 1 shows, dependable systems have a number of shared characteristics: availability, reliability, safety, integrity, and maintainability. This is particularly essential for mission-critical environments such as those illustrated. Users need to be confident that the system will perform the appropriate actions in a given situation, and that it can’t be hacked into, which means the AI needs to be trustable from the ground up. As a test company, we’re looking at what we can do down at the semiconductor level to apply trustable, explainable AI in the process.


Figure 1. Dependable systems are essential for electrical and electronic applications,
particularly those with life-or-death implications.

What is trustable AI?

Currently, much of AI is a black box; we don’t always know why the AI is telling us to do something. Let’s say you’re using AI to determine pass or fail on a test. You need to understand what conditions will cause the test to fail – how much can you trust the results? And how do you deal with errors? You need to understand what’s going on inside the AI, particularly with deep learning models: which errors are critical, which aren’t, and why a decision is made.

A recent, tragic example is the Boeing 737 MAX 8 jet. At the end of the day, the crashes that occurred were due to the failure of an AI system. Based on the sensor data it continually monitored, the system was designed to engage and prevent stalling at a high angle of attack – all behind the scenes, without the pilot knowing it had taken place. The problem was that the system engaged corrective action at the wrong time because it was getting bad data from a sensor. This makes it an explainable failure – technically, the AI algorithm worked the way it was supposed to, but the sensors malfunctioned. Boeing could potentially rebuild confidence in the airplane by explaining what happened and what they’re doing to prevent future disasters – e.g., adding more redundancy, taking data from more sensors, improving pilot training, etc.

But what if a deep learning model was the autopilot rather than the simpler model that acts based on sensor data? Due to the black box nature of deep learning models, it would be difficult to assure the public that the manufacturer knew exactly what caused the problem – the best they could do would be to take what seemed like logical measures to correct the issue, but the system would NOT be trustable.

What does this mean for AI going forward? What are the implications of not having trustable AI? To understand this, we need to look briefly at the evolution of AI.

The “next big thing”… for 70 years

As Figure 2 shows, for seven decades now, AI has been touted as the next big thing. Early on, AI pioneer Alan Turing recognized that a computer equivalent to a child’s brain could be trained to learn and evolve into an adult-like brain, but bringing this to fruition has taken longer than he likely anticipated. During the first 25 years of the AI boom, many demos and prototypes were created to show the power of neural networks, but they couldn’t be used for real-world applications because the hardware was too limited – the early computers were very slow with a minuscule amount of memory. What followed in the 1970s was the first AI winter. The second boom arose in the 1980s and ‘90s around expert systems, and their ability to answer complex questions. The industry created very customized expert-system hardware that was expensive and tough to maintain, and the applications were mediocre, at best. The second AI winter ensued.

Figure 2. The evolution of AI has been marked by hype cycles, followed by AI winters.

For the past 20 years, AI has enjoyed a fairly steady upward climb due to the confluence of parallel processing, higher memory capacity, and more massive data collection, with data being put into lakes rather than silos to enable better flow of data in and out. Having all these pieces in place has enabled much better algorithms to be created, and Internet of Things (IoT) devices have created a massive, continuous flow of data, aiding in this steady progression.

What this means, however, is that we are currently in the next hype cycle. The main focus of the current hype is autonomous cars, medical applications such as smart pacemakers, and aerospace/defense – all areas with life-and-death implications. We need trustable AI; otherwise, we will not have dependable systems, which will lead to disappointment and the next AI winter. Clearly, we need to avoid this.

AI in the semiconductor industry

With this backdrop, what are some challenges of applying AI within the semiconductor industry?

  • Fast rate of technological advancement. AI is great for object recognition because it’s learning to recognize things that don’t change that much, e.g., human beings, trees, buildings. But in the semiconductor industry, we see a steady parade of process shrinks and more complex designs that bring new and different failure modes.
  • Difficult to apply supervised learning due to a lack of labeled training data for these new areas.
  • High cost of test escapes. If a faulty device is passed and sent out for use in an application – an autonomous driving system, for example – and subsequently fails, the cost could be life and death. Therefore, both risk aversion and the need for certainty are very high.

To meet these challenges requires a different type of AI. A major research focus in the AI community is on developing explainable AI techniques designed to provide greater transparency and interpretability, but these techniques are currently far from fully opening AI model black boxes. Today, our focus is on development of explaining AI. With this approach, we look for opportunities to use machine learning models and algorithms – deep learning, clustering, etc. – to provide insight into the data so that we can make better decisions based on the information. By looking for ways to use AI that have more upside potential for insight, and staying away from those that increase risk, we can create more trustable AI. This will allow us to make semiconductors that operate more accurately, reliably and safely – that is, devices that exhibit all the characteristics associated with dependable systems.

Reduced test time or higher test quality?

If we use deep learning to analyze test results, we may find that we don’t need to do as many tests – for example, 10 tests could replace a previous test flow that required 30 tests, which would greatly reduce test time required. But if the models are imperfect and result in more test escapes, you end up losing dependability for the devices and the systems they go into.

Machine learning exploits correlations between measurements, but every machine learning algorithm makes mistakes. As shown in the table, there are two kinds of risks you can take: a) to remove outliers, risk failing good devices at the expense of yield loss, and lose money; or b) to reduce test time, risk passing bad devices, and lose dependability. Multivariate outlier detection can be used to find additional failures, while deep learning can be employed to detect complex, but well-characterized, failures. Either way, you need explainable decisions.
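One common way to implement multivariate outlier detection is the Mahalanobis distance, which flags devices whose combined measurements fall outside the learned population even when every individual measurement is in spec. The sketch below is a generic illustration of that idea, not Advantest's production algorithm; the threshold and data are assumed.

```python
import numpy as np

# Multivariate outlier screening on per-device test measurements
# using Mahalanobis distance. Illustrative sketch only.
def mahalanobis_outliers(x, threshold):
    mu = x.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(x, rowvar=False))
    # per-device distance: sqrt((x-mu) @ cov_inv @ (x-mu))
    d = np.sqrt(np.einsum('ij,jk,ik->i', x - mu, cov_inv, x - mu))
    return d > threshold

rng = np.random.default_rng(0)
good = rng.normal(0.0, 1.0, size=(500, 3))   # in-family devices
bad = np.array([[6.0, -6.0, 6.0]])           # one out-of-family device
flags = mahalanobis_outliers(np.vstack([good, bad]), threshold=5.0)
# the appended device is flagged; as the text argues, each flagged
# device still needs an explainable decision before rejection
```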

Explaining AI for engineering applications

Applying AI algorithms to your process requires powerful visualization tools to help you gain further insights into your data. Understanding what the machine learning is telling you will enable you to make decisions based on the data. Let’s take, as an example, machine learning-based debug during post-silicon validation. After your design is complete and you have your first chips, you now want to perform a variety of exploratory measurements on the device to determine that it’s doing what you want it to do.

We are currently researching an innovative approach for applying machine learning in post-silicon validation, as shown in Figure 3:

  1. Generate. Proprietary machine learning algorithms are used to smartly generate a set of constrained random tests that are designed to efficiently find complex relationships and hidden effects.
  2. Execute. The constrained random tests are then executed on the test system. When the results show relationships under certain conditions, we want to zero in on these and find out more about what’s going on in these specific areas. The data collected creates a model of the system.
  3. Analyze. Now that we have our model, we can perform offline analysis, running through a wide range of different I/O combinations and using proprietary machine learning algorithms to analyze the data and determine where there may be effects or issues we need to be aware of.
Figure 3. Machine-learning based post-silicon validation comprises the three steps shown above.
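A toy version of this generate/execute/analyze loop can be sketched as follows. The hidden-input device model and the linear surrogate below are purely illustrative stand-ins for the proprietary algorithms described above.

```python
import numpy as np

# Toy generate/execute/analyze loop. The "device" hides an unexpected
# dependence on input x2; a surrogate model fit to constrained random
# tests exposes it. All names and models here are illustrative.
rng = np.random.default_rng(1)

def device_under_test(x):
    # hidden effect: x2 leaks into the output alongside the intended x0
    return 2.0 * x[:, 0] + 0.3 * x[:, 2] + rng.normal(0, 0.01, len(x))

# 1. Generate: constrained random stimuli within legal input ranges
x = rng.uniform(-1, 1, size=(5000, 3))
# 2. Execute: run the tests and collect responses
y = device_under_test(x)
# 3. Analyze: fit a surrogate model and inspect learned sensitivities
coef, *_ = np.linalg.lstsq(x, y, rcond=None)
# coef ~ [2.0, 0.0, 0.3]: the nonzero weight on x2 reveals the hidden
# input that should be included in calibration
```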

In one example, we implemented this machine learning-based process to debug the calibration for a driver chip from one of our pin electronics cards. 500,000 test cases were generated, with all inputs varied, and the results were analyzed to find hidden relationships. Using the resulting model, virtual calibrations of the device were run with varying input conditions and the resulting root-mean-square (RMS) error for each calibration was predicted. The machine learning algorithm uncovered a hidden and unexpected effect of an additional input on the calibrated output. With this input included in the calibration, the RMS error was reduced from approximately 600 microvolts (µV) to under 200µV. When we took the results, including visualizations and plots, back to the IP provider for the chip, they were initially surprised by this unexpected effect, but were able to review the design and find the problem within just one day of obtaining the data. Two key benefits resulted: the calibration was improved, and the IP designer was able to tune the design for future generations of parts.

Another application for explaining AI is fab machine maintenance, where sensor and measurement data are being collected continuously while the machines are running. The question is what we do with the data. With a reactive approach, we’re flying blind, so to speak – we don’t know there’s a problem until we’re alerted to a machine operating out of spec. This results in unexpected downtime and creates problems with trustability and reliability. Far better is to take a predictive approach – ideally, one based not on setting simple conditional triggers alone, but that employs machine learning to look at the data and spot hidden outliers or other complex issues so that a potential problem with a machine is identified long before production problems result. By catching hidden issues – as well as avoiding false alarms – we obtain more trustable results.
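A minimal sketch of such a predictive trigger, using a rolling z-score in place of a full machine-learning model; the window size, threshold, and sensor data are illustrative assumptions.

```python
import numpy as np

# Predictive monitoring sketch: flag a drifting sensor channel with a
# rolling z-score before it crosses any hard spec limit.
def rolling_zscore_alarm(readings, window, z_limit):
    readings = np.asarray(readings, dtype=float)
    alarms = []
    for i in range(window, len(readings)):
        ref = readings[i - window:i]
        z = (readings[i] - ref.mean()) / (ref.std() + 1e-12)
        if abs(z) > z_limit:
            alarms.append(i)
    return alarms

# Stable sensor that starts drifting at sample 80
data = [1.0 + 0.01 * (i % 3) for i in range(80)]
data += [1.0 + 0.05 * (i + 1) for i in range(20)]
alarms = rolling_zscore_alarm(data, window=30, z_limit=6.0)
# the drift is flagged within a few samples of its onset, well before
# a fixed out-of-spec trigger at, say, 2.0 would ever fire
```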

The bottom line

Dependable systems require trustability and explainability. Machine learning algorithms hold great promise, but they must be carefully and intelligently applied in order to increase trust and understanding. Explaining AI approaches, which can provide greater insight and help us make better decisions, have many powerful applications in the semiconductor industry.


TDR with Recursive Modeling Optimizes Advanced-Package FA

by Shang Yang, Ph.D., Senior R&D and Application Engineer, Advantest Corp.

As the range and volume of chips developed for a host of Internet of Things (IoT) applications continue to escalate, conventional failure analysis (FA) techniques are increasingly challenged by the higher input/output (I/O) density and data throughput associated with complex 2.5D and 3D IC packages. These structures are not flat and one-dimensional; they more closely resemble skyscrapers, with many “floors” or layers, as Figure 1 illustrates. In this example, these layers sit on a complex foundation of microbumps, interposers and through-silicon vias (TSVs), on top of a laminate material that is attached to the printed circuit board (PCB) using ball grid array (BGA) bumps. This type of complexity makes it increasingly difficult, when conducting FA on the chip structure, to pinpoint the location of a failure from the package level to the die level.

Figure 1: Multidimensional chips, such as the 3D IC package shown here, face significant challenges with respect to performing failure analysis.

Techniques such as x-ray scanning can perform FA on these devices, but these processes are lengthy, which is problematic given the fast time-to-market windows that IoT devices and applications require. For example, if a 5-micron solder bump is determined to be the source of a failure, it is highly challenging to determine whether the crack is on the top or the bottom surface of the bump. Conducting FA by performing x-ray scanning through the entire chip can take up to a few days depending on the chip complexity.

Time-domain reflectometry (TDR) is increasingly being deployed in order to determine the location of the problem more quickly. However, applying TDR analysis for defect characterization inside the die creates its own set of challenges, as this method becomes less accurate if the failure point is between the package-die interface and the transistors. This combination of challenges points to the need for a new approach to TDR.
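The basic TDR localization arithmetic helps explain why jitter matters so much at these resolutions; the effective dielectric constant below is an assumed value.

```python
import math

# Locating a discontinuity from the TDR round-trip time: a standard
# first-order calculation with an assumed effective dielectric constant.
def tdr_fault_distance_mm(round_trip_ps, eeff):
    c_mm_per_ps = 0.299792458
    v = c_mm_per_ps / math.sqrt(eeff)   # propagation velocity in the trace
    return v * round_trip_ps / 2        # the pulse travels there and back

d = tdr_fault_distance_mm(120.0, 3.0)   # ~10.4 mm to the reflection
# resolving 5 microns at this velocity corresponds to a round-trip
# timing difference of only tens of femtoseconds, hence the need for
# the ultra-low-jitter approach described below
```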

Effective defect searching

To further aid in understanding why a revised TDR technique is necessary, let’s take a look at a general chip FA process (Figure 2), which leverages two kinds of inspection – structural and functional – both of which are needed to debug the defect down to the device level. The first step is conducting a visual inspection using the human eye or a microscope. Obvious cracks in the chip may be detected and the failure location narrowed down to the package level with approximately 1000-micron resolution.

Figure 2: Structural and functional inspection techniques are both necessary for failure analysis, but a gap exists on the functional side that conventional TDR cannot fill.

Step 2, electrical evaluation, uses an oscilloscope or curve tracer to verify the functionality of each pin. At this point, the failure location may be further narrowed down to the pin level with resolution of about 300 microns. Next, using TDR, x-ray or ultrasonic imaging, the failure point is further investigated at the interconnect level, down to a resolution of around 100 microns.

While there are a number of powerful tools that can conduct further structural inspection and analysis at the die level, a large gap exists between functional inspection steps 3 and 4, as the figure illustrates. If the density of devices inside the 100-micron scale is very high, conducting step 4 efficiently and getting down to the submicron device level for FA becomes highly difficult. Further complicating the matter is that functional solutions are faster with lower accuracy whereas structural methods are more accurate, but take much longer. A high-resolution TDR system that can deliver accurate results quickly is needed to fill this gap.

TS9000 TDR enables high-res die-level accuracy

Advantest has addressed these challenges by developing a TDR option for its TS9000 terahertz analysis system to achieve real-time analysis with ultra-high measurement resolution. The TS9000 TDR Option relies on Advantest’s TDR measurement technology to pinpoint and map circuit defects utilizing short-pulse signal processing. Figure 3 shows the difference between conventional TDR and the Advantest approach.

Figure 3. Conventional TDR is intrinsically a high-noise, high-jitter process. High-res TDR with the TS9000 option replaces the sampler and step source with photoconductive receptors, enabling low noise and very low jitter.

Using laser-based pulse generation and detection, the Advantest solution delivers impulse-based TDR analysis with ultra-low jitter, high spatial precision of less than 5 microns, and a maximum measurement range of 300mm, including internal circuitry used in TSVs and interposers.

Having a high-resolution TDR solution alone does not guarantee the ability to detect the defect all the way down to the design level. Another problem is signal loss – if it is very high, it will have two effects on the front-end-of-line reflected pulse: the pulse will have reduced amplitude and large spread. This makes it difficult to pinpoint the specific defect location.

Recursive modeling (see Figure 4) simulates “removing” all the layers to enable virtual probing at the desired level without destroying the device or being hampered by the hurdles that conventional FA techniques present. This overcomes the challenge of the probe point not always being available due to probes’ minimum pad size requirement and limited accessibility to points far inside the die. The probe can move down layer by layer, de-embedding each trace and recursively measuring the signal pulse, until the defect point can be clearly observed and characterized in the TDR waveform at the interface before the front end of line (FEOL).
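Conceptually, the recursive de-embedding can be sketched as follows; the per-layer delays, the scalar loss model, and all numeric values are illustrative assumptions, not the TS9000's actual algorithm.

```python
# Conceptual sketch of recursive de-embedding: at each layer, remove
# that layer's round-trip delay and attenuation from the measured
# reflection, moving the "virtual probe" one layer closer to the
# defect. The layer parameters and loss model are illustrative.
def de_embed(reflection_ps, reflection_amp, layers):
    """layers: list of (delay_ps, loss_fraction), package side first."""
    for delay_ps, loss in layers:
        reflection_ps -= 2 * delay_ps          # round trip through the layer
        reflection_amp /= (1 - loss) ** 2      # restore round-trip attenuation
    return reflection_ps, reflection_amp

# Measured echo after traversing laminate, interposer, and TSV layers:
t, a = de_embed(58.0, 0.12, [(20.0, 0.3), (6.0, 0.2), (2.0, 0.1)])
# the remaining delay and restored amplitude now describe the defect
# itself, with the spread and loss of the upper layers removed
```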

This impulse-based TDR approach has proven to be a highly effective method for quickly localizing failure points in 2.5D/3D chip packages, with ultra-high resolution. The recursive modeling technique described, when implemented with the Advantest TS9000 TDR, can greatly increase the strength of the reflected signal and reduce the spread effect to ensure high-accuracy defect detection.

Figure 4. In recursive modeling, the layers of the device can be virtually peeled away like an onion and probing conducted far inside the die to determine a defect’s nature and location.
