We upload to our blog every couple of weeks, sharing insightful articles from our engineers as well as company news and our opinions on recent industry topics. Subscribe to our mailing list to get great content delivered straight to your inbox.
Timelinks (also known as time constraints, time lag constraints, time loops or close coupling constraints) are one of the most challenging aspects of a wafer fab to navigate and significantly increase the complexity of scheduling it. We take a dive into a case study that shows how optimization can be used to manage timelinks to alleviate pressure on bottleneck tools.
A timelink is a maximum amount of time that can elapse between two or more consecutive manufacturing process steps of a lot (a group of silicon wafers). Defining timelinks is necessary to mitigate the risk of oxidation and contamination of wafers while waiting between process steps. Violating these timelinks can lead to wafers being scrapped or undergoing a costly rework due to exposure to impurities – ramping up the production costs for a fab.
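As a minimal sketch of the definition above, the following Python snippet checks a single lot's schedule for timelink violations. The `Step` class, the step names and the times are hypothetical, and the sketch assumes a timelink is measured from the end of one step to the start of the next; a real fab may measure it differently.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    start: float  # hours from the start of the schedule
    end: float    # hours from the start of the schedule


def timelink_violations(steps, max_gap_hours):
    """Return (previous step, next step, waiting time) for each violated timelink.

    steps: a lot's process steps in route order.
    max_gap_hours: max_gap_hours[i] is the longest allowed wait between
    the end of steps[i] and the start of steps[i + 1] (an assumption).
    """
    violations = []
    for i in range(len(steps) - 1):
        wait = steps[i + 1].start - steps[i].end
        if wait > max_gap_hours[i]:
            violations.append((steps[i].name, steps[i + 1].name, wait))
    return violations


# Hypothetical lot with a two-hour timelink between each pair of steps.
lot = [
    Step("clean", start=0.0, end=1.0),
    Step("etch", start=2.5, end=3.5),      # waits 1.5 h after "clean"
    Step("furnace", start=6.0, end=10.0),  # waits 2.5 h after "etch"
]
violations = timelink_violations(lot, max_gap_hours=[2.0, 2.0])
print(violations)  # [('etch', 'furnace', 2.5)]
```

A check like this only reports violations after the fact; avoiding them requires deciding when to start the earlier step, which is the scheduling problem discussed in the rest of this article.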
Managing timelink violations is therefore critical, as their impact on cost and chip quality must be balanced with delivery speed (measured by cycle time) and other cost considerations such as leaving tools idle. When it comes to managing timelinks, most fabs look towards heuristics. However, the complexity of time constraints is difficult for a rules-based approach to navigate and calls for a more advanced scheduling solution. For example, deciding whether or not to dispatch a lot when the subsequent tool has a timelink requires the ability to look at the future state of tools further along the schedule. If you use optimization to generate a schedule, looking into the future becomes a lot more straightforward.
To demonstrate how optimization can help tackle the scheduling of timelinks on bottleneck tools, we take a dive into a case study where scenarios are run using Flexciton’s advanced scheduler. The case study demonstrates how timelinks at a bottleneck tool can be managed by looking several steps into the future and delaying earlier steps to more evenly balance the line.
For illustrative purposes, we consider a small problem with only 33 lots. Each lot has up to 6 remaining steps to be scheduled across 52 tools. There are time constraints of varying duration between all consecutive steps.
The sequence of process steps is defined by the product routes: A, B, C, D and ‘Other’. Routes A, B, and C all end on tool Z which is a diffusion furnace tool that runs batches of five lots at a time. All the timelinks around this tool are relatively tight at around two hours.
We begin by running a production schedule through Flexciton’s scheduler without considering timelinks, where we prioritise minimising cycle time alone (Fig. 1).
When timelinks are included in the schedule, we first eliminate their violations before minimising the cycle time (Fig. 2).
Figure 2 shows that the light green lots (the last batch on tool Z) are shifted right on the W toolset to avoid violating timelinks on the next step, which would incur the cost of scrapping wafers or performing rework. This creates a period of idle time on the W toolset and delays the other lots on W. If toolset W were considered in isolation, this solution would be suboptimal. However, when taking into account both toolsets, this provides a far better outcome.
With timelinks met, we also more evenly balance queue times across the line, as demonstrated in Figure 3, which shows total queue time at the final two toolsets used in Route A in order of process steps. Toolset Z still has the highest queue time due to it being the bottleneck, but the difference is substantially reduced. This balancing reduces the bottleneck effect on tool Z.
This case study illustrates how using advanced optimization can help balance the line and reduce the queue time at problematic bottleneck tools. By harnessing the power of optimization we are able to assess the state of tools further down the line – something that isn’t realistically possible with traditional heuristics. This ability to significantly reduce queueing time can go a long way to helping a fab manager to hit KPIs such as the reduction of cycle time whilst avoiding costs incurred from scrapping wafers. However, the problem of scheduling timelinks becomes even more complex when you begin to consider wafers of differing priorities.
Want to learn more? Take a dive into Part 2 of this article where we will be taking a look at how to solve this problem with the added complexity of priority wafers.
A discipline of Machine Learning called Reinforcement Learning has received much attention recently as a novel way to design system controllers or to solve optimization problems. Today, Jannik Post – one of our optimization engineers – takes a look at the background of the methodology, before reviewing two recent publications which apply Reinforcement Learning to scheduling problems.
Traditionally, semiconductor fabs have relied on real-time dispatching systems, which can reflect the current state of the work in progress within seconds, to provide their operators with dispatch decisions. These systems may follow rules based on heuristics or derive them from domain knowledge, which makes their design a lengthy process that requires deep knowledge of the fab processes. Maintenance of the contained logic also requires continuous attention from subject matter experts. In addition, these systems have very limited awareness of the global effects of decisions at toolset level, which makes them prone to suboptimal decisions.
More advanced approaches to wafer fab scheduling rely on optimization models, which can take many factors into account, e.g., the effect of dispatching decisions on bottleneck tools further downstream. These solutions will generally require a slightly longer computation time to achieve high-quality solutions.
Reinforcement Learning (RL) promises to avoid the downsides of both common dispatching systems and optimization approaches. So, how does it work? At the heart of RL there is an agent* which performs a task by taking decisions or controlling a system. The goal is to teach this agent to make close to optimal decisions by allowing it to explore different options and providing feedback on the quality of its decision. Good decisions are rewarded whilst suboptimal decisions are punished. Of course, this training will not be performed in a live environment, but rather by simulating thousands of scenarios that might occur to prepare the agent for any possible situation.
A common example of Reinforcement Learning is self-driving cars, but it can easily be seen how it could be productive when used in other environments, such as dispatching in a wafer fab. In theory, it could be utilised to dispatch wafers to tools in a way that optimizes certain KPIs – such as throughput.
Numerous recent publications have explored the use of RL for production control. However, the approaches are still in their early stages and applied to problems much less complex than semiconductor scheduling. Nevertheless, they demonstrate the potential to play a part in future solution strategies. Two approaches stood out to us when reviewing the literature:
“Learning to Dispatch for Job Shop Scheduling via Deep Reinforcement Learning” (2020)
This paper by Zhang et al. describes an approach to designing an agent that generalises its knowledge beyond what it has been trained to do, enabling it to handle unseen problem instances. This is achieved by initially conducting a large amount of diverse training scenarios. The model can flexibly handle instances of different sizes, e.g., with varying numbers of tools.
The agent is first trained on large numbers of scenarios and will thereby learn to exploit common patterns and perform well in instances not encountered before. After the training, the agent can be deployed to solve new instances. As training is conducted separately from solving an instance, the latter can be performed in less than a minute. The performance on benchmarking problems is compared against optimization models and simple dispatching heuristics. The Reinforcement Learning approach yields a makespan (the total duration of the schedule from start to finish) between 10% and 30% longer than when computed through optimization, but around 30% shorter than what simple heuristics achieve.
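To make the makespan metric concrete, here is a minimal Python sketch of a toy job shop scheduled with a simple greedy dispatching rule. The rule, the function name `dispatch` and the instance are illustrative assumptions; this is neither the agent from Zhang et al. nor one of the heuristics benchmarked in the paper.

```python
def dispatch(jobs):
    """Greedy dispatcher: repeatedly schedule the task that can start earliest.

    jobs: list of jobs, each a list of (machine, duration) tasks in route order.
    Returns (schedule, makespan), where schedule holds
    (job index, task index, machine, start, end) tuples.
    """
    next_task = [0] * len(jobs)  # index of each job's next unscheduled task
    job_ready = [0] * len(jobs)  # time at which each job's previous task ends
    machine_free = {}            # time at which each machine becomes free
    schedule = []

    while any(next_task[j] < len(jobs[j]) for j in range(len(jobs))):
        best = None
        for j, tasks in enumerate(jobs):
            if next_task[j] == len(tasks):
                continue  # this job is finished
            machine, duration = tasks[next_task[j]]
            start = max(job_ready[j], machine_free.get(machine, 0))
            # Ties are broken in favour of the lowest job index.
            if best is None or start < best[0]:
                best = (start, j, machine, duration)
        start, j, machine, duration = best
        end = start + duration
        schedule.append((j, next_task[j], machine, start, end))
        job_ready[j] = end
        machine_free[machine] = end
        next_task[j] += 1

    # Makespan: the time at which the last task in the schedule finishes.
    makespan = max(entry[-1] for entry in schedule)
    return schedule, makespan


# Hypothetical instance: three jobs routed across two machines, A and B.
jobs = [
    [("A", 3), ("B", 2)],
    [("A", 2), ("B", 4)],
    [("B", 3), ("A", 1)],
]
schedule, makespan = dispatch(jobs)
print(makespan)  # 9
```

Comparing the makespan produced by different dispatchers on the same instance is the kind of benchmark the percentages above refer to.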
“A Reinforcement Learning Environment for Job Shop Scheduling” (2021)
This paper published by Tassel et al. sets out to design a reinforcement learning environment to optimize job shop scheduling (JSS)** problems as an alternative to optimization models. The objective in this approach is to reduce periods in the schedule where tools are not in use, which is shown to correlate with a minimisation of makespan. The agent is designed as a dispatcher and is trained on a single scenario at a time by running a real-world simulation over and over. As the goal is to generate an optimized solution for the instance, the best solution achieved during training will be saved. Training time and solution time are thus the same in this approach and are limited to 10 minutes to reflect production requirements. In this approach, there is no intention to generalise the behaviour of the agent to other instances.
The authors disclose a makespan of just 10-15% worse than the best known benchmarks for job shop scheduling, and just 6-7% longer than time-constrained optimization approaches.
At Flexciton, we are excited about bringing cutting-edge optimized scheduling to wafer fabs worldwide. We are always exploring new ways that could help us improve the service we provide our customers so it’s exciting to see new emerging technologies which may help solve scheduling challenges in the semiconductor industry. The two publications reviewed in this article both present promising new approaches that yield measurable improvements over simple dispatching heuristics, but still fall short of optimization.
Both approaches can cope with disruption and stochasticity of the environment, such as machine downtimes. Another commonality is that both can readily be applied to problems of different sizes. In both cases the authors respected the requirement for frequent schedule updates (Tassel et al.) and quick decision support (Zhang et al.) and still achieved optimized solutions. It is conceivable that reinforcement learning has the capability to teach an agent to make smart decisions in the present that will improve the future fab state and reduce bottlenecks.
However, as the use of RL for JSS problems is still a novelty, it is not yet at the level of sophistication that the semiconductor industry would require. So far, the approaches can handle standard small problem scenarios but cannot handle flexible problems or batching decisions. Many constraints need to be obeyed in wafer fabs (e.g., timelinks and reticle availability) and it is not easily guaranteed that the agent will adhere to them. The objective set for the agent must be defined ahead of training, which means that any change made afterwards will require a repeat of training before new decisions can be obtained. This is less problematic for solving the instance proposed by Tassel et al., although their approach relies on a specifically modelled reward function which would not easily adapt to changing objectives.
Lastly, machine learning approaches can lead to situations where the decisions taken by the agent are hidden in a black box. When insight into the rationale behind decisions is limited, troubleshooting becomes difficult and trust in the solution is hard to establish.
Using wafer fab scheduling to meet KPIs such as increased throughput and reduced cycle time is a challenge that requires a flexible, quick, and robust solution. We have developed advanced mathematical hybrid optimization technology that combines the capabilities of optimization models with the quickness of simple dispatching systems. When needed, the objective parameters and constraints can be adjusted without the need to rewrite or redesign extensive parts of the solution. It can therefore easily be adapted to optimize bottleneck toolsets, a whole fab or even multiple fabs.
Flexciton’s scheduling software produces an optimized schedule every five minutes and easily integrates with existing dispatching systems. The intuitive interface enables users to investigate decisions in a wider context, which helps during troubleshooting and increases trust in the dispatching decisions.
[1] Zhang, Song, Cao, Zhang, Tan, Xu (2020). “Learning to Dispatch for Job Shop Scheduling via Deep Reinforcement Learning.”
[2] Tassel, Gebser, Schekotihin (2021). “A Reinforcement Learning Environment for Job-Shop Scheduling.”
[3] Five reasons why your wafer fab should be using hybrid optimization scheduling (Flexciton Blog)
* – We use the term ‘agent’ to describe a piece of software that will make decisions and/or take actions in its environment to achieve a given goal
** – The job shop is a common scheduling problem in which multiple jobs are processed on several machines. Each job consists of a sequence of tasks, which must be performed in a given order, and each task must be processed on a specific machine.
The photolithography process is considered the most critical step in semiconductor wafer fabrication, where geometric shapes and patterns are reproduced onto a silicon wafer, ultimately creating the integrated circuits.
What makes this process unique is the use of additional resources, called reticles. A reticle is a photomask through which ultraviolet light is projected to transfer a specific pattern onto the wafer. When not in use, reticles are stored in dedicated storage with a fixed capacity, called a stocker. Although the problem of allocating reticles in a stocker is not a “core” one in wafer production scheduling, if not optimized, it might significantly impact overall production efficiency by causing bottlenecks.
This week, Daniel Cifuentes Daza, one of the Optimization Engineers here at Flexciton, explores this problem by reviewing a technical paper by Benzoni et al., “Allocating reticles in an automated stocker for semiconductor manufacturing facility”, and contrasting their approach with the one we use when scheduling at Flexciton.
A fab working with a wide variety of products may need several thousand reticles at any given time to fulfil its production requirements [2]. Not only must reticles be stored in a stocker, as explained above, but they often also need to be transported in a container known as a pod to prevent contamination. The limited capacity and availability of stockers and pods within the fab therefore make deciding where each reticle should be stored at each step of the production schedule extremely complex, frequently causing bottlenecks [3].
To manage this process, fabs need to decide the best way to allocate reticles into pods, and then try to find an optimal assignment between pods and tools. However, there is another optimization problem at hand that complicates the process further: the position of reticles within the limited-capacity compartments of the stocker itself.
The time to retrieve a reticle from storage can be drastically different depending on its location inside the stocker, leading to large inconsistencies in the so-called processing time of the stocker. As a result, the stocker can become a bottleneck by not dispensing reticles fast enough to meet wafer demand. Therefore, the reticle allocation problem also consists of choosing which reticles are to be stored in the low-capacity fast-retrieval compartment (“retpod”) vs the high-capacity slow-retrieval compartment.
In order to explore what might be the best way to address this problem, we have reviewed a technical paper published by IEEE at the 2020 Winter Simulation Conference (WSC). The authors of the paper address the allocation issue using the famous knapsack problem. The next section of this article evaluates this approach and its pros and cons, before comparing the proposed solution with how we model photolithography tools here at Flexciton.
In “Allocating reticles in an automated stocker for semiconductor manufacturing facility” by Benzoni et al. (2020) [5], the stockers examined by the authors have two compartments: one where reticles are stored in pods (the retpod), and another where they are stored without pods. The main objective is to allocate reticles into the retpod compartment, as this has faster retrieval times.
Additionally, the authors consider:
(1) the reticles, as the main resource of the problem,
(2) the steps where wafers will need to use the reticles in the near future, and
(3) the capacitated storage for reticles: the compartments of the stockers.
They also assign each reticle a profit value: the number of wafers processed in the batch. With this initial information, the problem can be modelled as the well-known knapsack problem.
Sheveleva and Belyaev (2021) [4] define the knapsack problem as follows:
“There are k items with weight nk and value ck, and a knapsack with a capacity N. The problem is to fill the knapsack with items with the maximum total value, respecting the knapsack’s capacity limit”
In this case, each item k is a reticle, its corresponding weight is always 1, and its value ck is the profit value. The knapsack is the retpod compartment of the stocker.
The knapsack problem is an NP-hard combinatorial problem that has been studied for many years within computer science, operations research, and other sciences. Therefore, due to its complexity, the authors decided to use a well-known heuristic. Here, the approach is to rank each reticle according to a specified objective value ratio and then fill the knapsack with the first N elements fulfilling its capacity.
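The ranking heuristic described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `allocate_to_retpod`, the reticle ids and the profit values are hypothetical, and the paper's actual ranking ratios are not reproduced here.

```python
def allocate_to_retpod(profit_by_reticle, capacity):
    """Rank reticles by profit and keep the top `capacity` of them.

    profit_by_reticle: dict mapping reticle id -> profit value
        (assumed here to be a wafer count).
    capacity: number of slots N in the fast-retrieval compartment.
    Every reticle occupies exactly one slot (weight 1).
    """
    # Highest profit first; ties are broken by reticle id for determinism.
    ranked = sorted(profit_by_reticle, key=lambda r: (-profit_by_reticle[r], r))
    return ranked[:capacity]


# Hypothetical reticles and profit values.
profits = {"R1": 25, "R2": 75, "R3": 0, "R4": 50, "R5": 50}
retpod = allocate_to_retpod(profits, capacity=3)
print(retpod)  # ['R2', 'R4', 'R5']
```

Because every reticle has weight 1, taking the N highest-ranked reticles maximises the total of whatever value is used for ranking, so the quality of the allocation depends on how well that value reflects future demand.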
The authors benchmarked three different objective functions for this heuristic as follows:
The three approaches reported an increase in the utilisation of the reticles from around 8% to 20%. This implementation also led to a reduction of processing times for the stockers of 1 hour. Strategies 1 and 2 showed the lowest error percentage, which is expected as Strategy 3 does not consider future steps where reticles are used.
Using the knapsack approach to solve this problem certainly has some positive points. Firstly, a heuristic method is easy to implement and does not require much computational time, which also makes it scalable to industrial-sized problems. Secondly, it is trivial to work out why certain allocation decisions are taken, making it highly understandable. Lastly, the approach is flexible because the user can modify the objective function of the heuristic depending on the fab’s goals.
However, the issue of reticle allocation is just a small piece of the complex wafer manufacturing process. Since this approach is modelled as a standalone problem, it is creating feasible solutions for the reticle stocker alone without considering the state of the rest of the fab. This will likely lead to inconsistencies as the wafer schedule is intrinsically linked to the reticle allocation.
In addition, the approach described in the paper models a simplification of the photolithography area. There is relevant information missing, such as the availability of pods in the fab and their possible allocation to machines, transfer times, and load and unload times. The use of this information would give robustness to the approach.
At Flexciton, we consider that the best way to tackle the reticle allocation problem is to proactively generate not only feasible solutions, but optimized production schedules. In order to do this, we take into account all the scheduling constraints for reticles available within our optimization engine – using information such as:
Considering all of this information in a single optimization model means that we can provide a consistent and robust production schedule that takes into account all the constraints of reticles, pods and stockers. Additionally, our scheduler allows the user to configure their specific business objectives into the optimization process in order to meet their fab’s KPIs. The optimization is then run and an optimized schedule is returned in a matter of minutes. All of this means that our technology is able to return a reliable, scalable and flexible solution that is tailored to our client’s needs, whilst optimizing the photolithography area in its entirety.
[1] Y. T. Lin, C. C. Hsu and S. Tseng, "A Semiconductor Photolithography Overlay Analysis System Using Image Processing Approach," Ninth IEEE International Symposium on Multimedia Workshops (ISMW 2007), 2007, pp. 63-69, doi: 10.1109/ISM.Workshops.2007.16.
[2] S. L. M. de Diaz, J. W. Fowler, M. E. Pfund, G. T. Mackulak and M. Hickie, "Evaluating the impacts of reticle requirements in semiconductor wafer fabrication," in IEEE Transactions on Semiconductor Manufacturing, vol. 18, no. 4, pp. 622-632, Nov. 2005, doi: 10.1109/TSM.2005.858502.
[3] You-Jin Park and Ha-Ran Hwang, "A rule-based simulation approach to scheduling problem in semiconductor photolithography process," 2013 8th International Conference on Intelligent Systems: Theories and Applications (SITA), 2013, pp. 1-4, doi: 10.1109/SITA.2013.6560788.
[4] A. M. Sheveleva and S. A. Belyaev, "Development of the Software for Solving the Knapsack Problem by Solving the Traveling Salesman Problem," 2021 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (ElConRus), 2021, pp. 652-656, doi: 10.1109/ElConRus51938.2021.9396448.
[5] A. Benzoni, C. Yugma, P. Bect and A. Planchais, "Allocating Reticles in an Automated Stocker for Semiconductor Manufacturing Facility," 2020 Winter Simulation Conference (WSC), 2020, pp. 1711-1717, doi: 10.1109/WSC48552.2020.9383933.
London, UK. 21 October 2021: Flexciton, an optimisation technology company that has developed a unique solution to radically improve the efficiency and productivity of fab manufacturing processes, has announced a £15M Series A investment, led by Nadav Rosenberg (Saras Capital). Flexciton’s solution analyses the real-time data that each fab generates and applies cutting edge technology to decide which actions need to be taken to optimise production.
Modern semiconductor fabs are the most complex manufacturing environments in the world, with the production process generating more scheduling options than there are atoms in the universe. A next-generation chip wafer might go through between 500 and 2,000 machine steps in a dynamic process. The end-to-end process of making a single chip could take from six months to a year to complete.
Jamie Potter, co-founder and CEO of Flexciton, said, “Automation is already used by many semiconductor manufacturers. However, even in advanced fabs where scheduling itself is automated, the software used to make these decisions tends to be based on predefined rules programmed by humans and determined by historical data. Yes, it can calculate different options far quicker than a human operator could, but the options are still 'best guesses' rather than optimal outcomes.”
Potter continued, “Flexciton is able to create an overview of how the entire fab is operating and rapidly sift through the trillions of options available, to come up with the optimal decision at that precise point in time. Using AI-powered mathematical algorithms and Mixed Integer Linear Programming, we can analyse real-time data – not historical – and make the best choices possible based on what is happening in a fab at a given moment. Our vision is to become the best in the world at running semiconductor fabs, before turning our attention to support other manufacturers. This investment plays a key part in achieving this, as we expand our team.”
The current chip shortage continues to make headlines worldwide, highlighting a semiconductor supply chain that is far from robust. Covid-19 may have impacted production, but it has shown how difficult it is for the industry to quickly adapt to surges in demand. 169 industries** were affected by the shortage, from automotive and consumer electronics to steel producers and concrete manufacturers. But agility is not the only issue - demand continues to grow and the industry needs more capacity. Globally, the industry is expected to be worth $803 billion by 2028*.
Flexciton has been proven to achieve efficiency gains of 10 per cent. For a fab using 1,000 machines, this can save tens of millions per year. There are currently over 1,000,000 machines worldwide waiting to be optimised - a number that continues to grow due to the worldwide demand for semiconductors.
Nadav Rosenberg, an investor who has supported the team since their early days and led the current Series A round, added, “Semiconductors are the fundamental building blocks of modern life. More than any other development of the scientific age, they have completely revolutionised the way that we work, play, communicate and learn. Demand will keep growing, and the industry must consider how to better utilise its current assets, before expanding and using further resources to meet demand. Efficiency is key. Flexciton is the first company to successfully apply this level of machine intelligence to real-world manufacturing, with hugely impressive results.”
Mike Chalfen, Chalfen Ventures, commented, “In a globally strategic industry, with billions in capex and opex spent on the world’s most complex manufacturing operations, Flexciton measurably saves enormous costs, fast. Its insight can change the very capacity, agility and economics of the semiconductor industry. Flexciton has the team, technology and ambition to be an enduring and important company.”
Flexciton was founded in 2016 by Jamie Potter and Dennis Xenos who have worked in the optimised manufacturing field for more than ten years, with a focus on how advanced mathematics can solve manufacturing scheduling issues. They quickly realised that it was impossible for humans to understand the trillions of options that manufacturing processes generated and that it was only through the application of mathematics that the unpredictability of such complex systems could be understood.
The team has since grown to 41, made up of world-class experts, combining the disciplines of mathematics, semiconductor scheduling, AI, optimisation, data science and software development. Flexciton team members have published over five hundred academic papers, many of which are focused on optimisation technology. These papers and 10 years of academic research have become the foundation upon which Flexciton technology has been built.
Since its inception, Flexciton has received over £21m in funding, with its recent Series A round raising £15m. The Series A investment will be used for hiring across the team.
References
* – Semiconductor Market Size https://www.fortunebusinessinsights.com/semiconductor-market-102365
** – Impact of Semiconductor Shortage https://finance.yahoo.com/news/these-industries-are-hit-hardest-by-the-global-chip-shortage-122854251.html
Fabs usually approach scheduling in one of two ways: the heuristic approach, which is fast but not optimal, and the mathematical approach, which is optimal but time-consuming. In order to attain optimal results that are able to keep up with changes on the factory floor, fabs should consider a hybrid approach.
In order to maintain high margins, the cost of manufacturing semiconductors needs to continually diminish. Previously, this had been achieved by increasing the wafer size and shrinking the size of the chips whilst increasing the density of transistors. However, as the gains from these tactics begin to level off, the future of maximising a wafer fab’s capacity lies in optimizing operational processes.
The process of manufacturing a semiconductor chip is exceedingly complex, often requiring thousands of unique steps. Reducing cycle times and increasing throughput in such an intricate production process calls for very high production efficiency. Fabs usually approach this in one of two ways: the heuristic approach, which is fast but not optimal, and the mathematical approach, which is optimal but time-consuming. In order to attain optimal results that are able to keep up with changes on the factory floor, however, fabs need to switch to advanced production scheduling.
The hybrid approach
At Flexciton, we are pioneering a new model that combines the two different methods with a hybrid technique. With this ground-breaking new model, which we call advanced mathematical hybrid optimization technology, optimal results are delivered in a matter of minutes.
Here are 5 reasons your fab will benefit from switching to an advanced scheduling solution:
The most complex of scheduling problems can be solved in less than 5 minutes, delivering near-optimal results. This makes the hybrid technique perfect for the dynamic fab environment, since updates can keep up to speed with changes on the factory floor.
Hybrid optimization is able to realise fully accurate schedules by accommodating all constraints. This ensures a true representation of all activity in the fab, as well as its limitations.
By employing mixed-integer linear programming (MILP), Flexciton’s hybrid method guarantees high-quality solutions. Thanks to performance-enhancing decomposition, final solutions are very close to the global optimum.
With MILP being the core of the solution, high performance can more easily be maintained with very little upkeep. With changes in objectives and recipes constantly taking place, not every consecutive shift in a fab is alike. Despite this, Flexciton’s solution can take into account all aims and constraints and consistently calculate an optimized schedule.
When needed, constraints and parameters can be altered without the need to rewrite or redesign extensive amounts of code. Not only this, but hybrid optimization scheduling can also be rolled out across the entirety of a fab (global scheduling) as well as multiple fabs, even when they have differing production characteristics.
Hybrid optimization could be the answer to your fab’s scheduling problems. Download our white paper to find out more.
The scarcity and fragility of reticles presents fab operators with a tradeoff that we have assessed by investigating three case studies where Flexciton's intelligent scheduler has been used to explore the different outcomes.
Photolithography processes are central to producing computer chips and semiconductor devices. However, they are typically considered to be bottlenecks due to their reliance on a critical secondary asset: reticles. Reticles are limited in number and yet are a critical piece of the coat-expose-develop loop. What is more, reticles are delicate in nature; they are enclosed in purpose-built cases for their transport in order to keep the potential of damage or distortion to a minimum.
As such, a fundamental tradeoff arises when operating photolithography toolsets: moving a reticle to the machine where it is needed most (to carry out high-priority tasks) clashes with the requirement to be conservative with its transport. In theory, there are several compromises that the operator can make to reduce reticle movements. One example is waiting a bit longer so that more wafers arrive at a machine and a larger batch can be processed with a single move. However, in practice, identifying these strategic actions and balancing between the competing goals is highly complex. Flexciton can provide a solution to this issue by leveraging the power and flexibility of optimisation.
In this article, we show how Flexciton’s scheduling engine can balance between minimising cycle times and reticle moves. Through a series of example case studies, we delve into the scheduling trade-offs that arise in the day-to-day operation of a semiconductor fab and how Flexciton’s solution can assist in uncovering schedules that optimally balance across competing goals.
The Flexciton scheduler can accommodate a range of user-defined objectives. The fab operator is typically interested in minimising KPIs such as cycle times but may also want to include other considerations, such as penalising labour-intensive decisions (e.g., the number of batches built). In this vein, we have recently introduced a new component: the number of reticle moves carried out. As shown in Figure 1, the user is able to define a penalty factor for reticle moves; the higher the value, the harder the engine will try to avoid moving reticles.
In the example case study we have 6 machines and 48 wafers to be scheduled, with a total of 4 reticles.
Reticle 10001 is required for all lots schedulable on machines 01, 02 and 03. Deciding on how to move this in-demand reticle across the 3 machines will impact both the number of moves as well as cycle times, particularly as there are some high-priority wafers waiting to be dispatched. Reticle 10001 is originally loaded in machine 01.
The other three reticles, 20001, 20002 and 20003, are initially loaded on machines 04, 05 and 06 and can be used by all three toolsets interchangeably. However, different machines are better suited to different reticles; for example, in our case study the same process is completed faster if using reticle 20001 on machine 06. Note that all machines have a maximum batch size of 4 wafers.
We start off by not penalising the number of reticle moves and solely minimising the total priority-weighted cycle times (TWCT) across all wafers. The optimal schedule produced by Flexciton’s engine is shown in Figure 2.
There are a total of 7 reticle moves, noted with red arrows in the figure below. 4 moves pertain to reticle 10001, which is moved from its initial location 01 to 03 and then to 02 to process some high-priority wafers (as evidenced by the circled 1/2/3 next to the job names). The reticle is then moved back to machine 01 and finally to machine 03 to process some lower-priority jobs. Looking at machines 04, 05 and 06, the engine decides to immediately swap the reticles between the machines, to ensure that each lot is fed to its most suitable machine (in terms of processing times and capability). The TWCT of all 48 wafer steps (where priority weights are user-defined and in this case range from 1 for highest-priority to 0.1 for lowest-priority wafers) is 16.79 hours.
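As a rough illustration of the metric, the sketch below computes a priority-weighted cycle time as the sum of each wafer step's cycle time multiplied by its priority weight. The exact definition used inside Flexciton’s engine is not spelled out here, so the summation, the `WaferStep` type and the sample values are assumptions; only the weight range (1 for highest priority down to 0.1 for lowest) comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WaferStep:
    """Hypothetical record for one scheduled wafer step."""
    name: str
    priority_weight: float   # 1.0 = highest priority, 0.1 = lowest (range from the text)
    cycle_time_hours: float  # assumed: time from arrival to completion of the step


def total_weighted_cycle_time(steps):
    """Sum of weight * cycle time over all steps (assumed definition of TWCT)."""
    for step in steps:
        if not 0.1 <= step.priority_weight <= 1.0:
            raise ValueError(f"priority weight out of range for {step.name}")
    return sum(s.priority_weight * s.cycle_time_hours for s in steps)


# Illustrative values only, not data from the case study.
steps = [
    WaferStep("hot_lot", 1.0, 0.5),
    WaferStep("normal_lot", 0.5, 1.0),
    WaferStep("low_lot", 0.1, 2.0),
]
twct = total_weighted_cycle_time(steps)
print(f"TWCT = {twct:.2f} h")
```

Under this reading, delaying a low-priority wafer by an hour adds far less to the objective than delaying a high-priority one, which is why the engine tends to protect high-priority wafers first.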
In this second study, we have penalised reticle moves; the ratio for balancing TWCT and reticle moves has been set to 100:75, i.e. we will choose to avoid a reticle move only if its avoidance translates to an increase in TWCT of 0.75 hours or less. This is quite relaxed, but is aimed at avoiding reticle moves of little benefit, for which the risk of potential damage is deemed to outweigh the gain. The optimal schedule obtained is shown in Figure 3.
In this study, there are a total of 5 reticle moves, noted with red arrows in the figure below. 3 moves pertain to reticle 10001 and its journey across machines 01, 02 and 03. The main difference from the previous scheduling pattern is that now we do not move the reticle back to machine 03 to process the very last batch of low-priority wafers. Instead, we choose to wait for their arrival and process them right after the high-priority batch finishes, a bit after 19:00. This way we avoid that final reticle move, while also incurring a delay in the high-priority wafers scheduled on machine 02, which now have to be moved from 19:15 (in study 1) to 19:30.
Looking at machines 04, 05 and 06, the engine decides to immediately swap only two of the three reticles this time, and to leave reticle 20001 on its initial machine. Although that initial setup is not ideal in terms of processing times, it does prevent the reticle move deemed to be “lower value”. The TWCT of all 48 wafer steps is 18.73 hours.
In this study we look at the extreme case of using a very high penalty on reticle moves, hence allowing only absolutely necessary reticle moves. In particular, we have opted to use a TWCT to reticle move cost ratio of 1:10. In such cases, the operator is willing to accept sub-optimal job-machine allocation decisions, as well as delayed scheduling of high-priority wafers, for the purpose of keeping reticle movement to the absolute minimum. The optimal schedule obtained is shown in Figure 4.
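The ratios quoted for the second and third studies can be read as an exchange rate between TWCT and reticle moves. The sketch below assumes a plain linear trade-off and uses hypothetical function names; it converts a ratio into the TWCT increase (in hours) tolerated per avoided move and applies the rule stated for the second study. The two example moves are made up for illustration.

```python
from fractions import Fraction


def hours_per_move(twct_part, move_part):
    """TWCT increase (hours) accepted in exchange for avoiding one reticle move.

    Assumes a ratio 'twct_part:move_part' means one move costs
    move_part / twct_part hours of TWCT (the reading given for 100:75).
    """
    if twct_part <= 0 or move_part < 0:
        raise ValueError("TWCT part must be positive and move part non-negative")
    return Fraction(move_part, twct_part)


def avoid_move(twct_increase_hours, penalty_hours):
    """Avoid the move only if doing so costs no more TWCT than the penalty."""
    return twct_increase_hours <= penalty_hours


study_2 = hours_per_move(100, 75)  # 3/4 hour per move
study_3 = hours_per_move(1, 10)    # 10 hours per move

print(float(study_2), float(study_3))
# Hypothetical move whose avoidance adds 0.5 h of TWCT.
print(avoid_move(Fraction(1, 2), study_2))
# Hypothetical move whose avoidance adds 2 h of TWCT.
print(avoid_move(2, study_2), avoid_move(2, study_3))
```

With the 100:75 setting only moves worth less than 0.75 hours of TWCT are dropped, whereas the 1:10 setting drops any move that is not worth at least 10 hours, which in practice leaves only the moves that are strictly required.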
In this study, the total number of reticle moves has come down to just 2 moves, noted with red arrows in the figure below. Both moves pertain to reticle 10001 and its journey across machines 01, 02 and 03 to ensure all wafers are completed. In the case of machines 04, 05 and 06, we are still able to carry out all tasks, albeit with longer processing times, as evidenced in the much later finishing times of the machines. The TWCT of all 48 wafer steps is 23.20 hours.
Plotting the aforementioned runs (and also some more data points), we obtain Figure 5, which clearly illustrates the trade-off at play here. As we increase the penalty factor from a low to a high value, the number of reticle moves drops and the cycle times increase. As expected, these relationships are monotonic but not smooth, since they depend on discrete events. Note also that both curves are bounded from above and below, corresponding to the absolute minimum number of reticle moves required (in this case 2) and the maximum number of reticle moves that is optimal when moves are not penalised (in this case 7).
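Using only the three runs reported above (7, 5 and 2 reticle moves with TWCT of 16.79, 18.73 and 23.20 hours), the following sketch checks the monotonic relationship and expresses each penalised run as a relative change against the unpenalised one. The `Run` type and labels are illustrative, and the additional data points plotted in Figure 5 are not available here, so they are not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    label: str
    reticle_moves: int
    twct_hours: float


# Reported results, ordered from lowest to highest reticle-move penalty.
runs = [
    Run("study 1", 7, 16.79),
    Run("study 2", 5, 18.73),
    Run("study 3", 2, 23.20),
]

# As the penalty grows, moves should never rise and TWCT should never fall.
pairs = list(zip(runs, runs[1:]))
moves_non_increasing = all(a.reticle_moves >= b.reticle_moves for a, b in pairs)
twct_non_decreasing = all(a.twct_hours <= b.twct_hours for a, b in pairs)
print(moves_non_increasing, twct_non_decreasing)

# Relative change of each penalised run against the unpenalised baseline.
baseline = runs[0]
changes = {}
for run in runs[1:]:
    move_reduction = (baseline.reticle_moves - run.reticle_moves) / baseline.reticle_moves
    twct_increase = (run.twct_hours - baseline.twct_hours) / baseline.twct_hours
    changes[run.label] = (move_reduction, twct_increase)
    print(f"{run.label}: {move_reduction:.1%} fewer moves, {twct_increase:.1%} higher TWCT")
```

Tabulating the runs this way gives operators a quick view of how much cycle time each further reduction in reticle handling costs.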
By running a few scenarios with different parameters, the Flexciton engine opens up the possibility to explore the tradeoff frontier in detail, enabling operators to quantify how KPIs would change with a more relaxed or constrained attitude towards reticle movements.
In practice, enabling the Flexciton scheduling engine to consider reticle moves is a computationally challenging task, involving novel development in the model’s MILP formulations and heuristics. Nevertheless, this feature has been accommodated with no deterioration to performance and schedule quality. The Flexciton engine is capable of scheduling thousands of wafers across hundreds of machines in a few minutes while also controlling for the operator’s tolerance to reticle movement.
Indicatively, we showcase results obtained from scheduling a real-world fab. At the time of the study, the plant had a total of 3,478 wafers to be scheduled on 209 toolsets (with a total of 358 load ports). We computed two schedules: one with low and one with high penalisation of reticle moves. These scheduling runs were computed in roughly the same time, confirming that despite the added complexity, this feature can scale well and provide a schedule in a few minutes.
Focusing on the reticle machines, the results obtained suggested that reticle movements could be reduced by around 26% while leading to an increase in total cycle times of around 2%. Note that these results are priority-weighted, with further analysis revealing that high-priority wafers are not substantially impacted; the optimiser is able to identify “low-value” reticle movements relating to e.g. early processing of a low-priority wafer and either avoid that movement by using an alternative recipe, or defer it until later, when a low-priority wafer can be combined with a high-priority wafer in a batch.
Reticle scheduling is a very important consideration in the scheduling of advanced semiconductor fabrication plants. This resource, already highly constrained, comes with a critical consideration in practice: frequent movements and manual handling of the delicate reticles increase the risk of damage or distortion during transport. As such, the number of times a reticle is moved to a new machine must be managed conservatively. This inevitably clashes with the operator’s fundamental objective of reducing cycle times.
Flexciton has extended the capabilities of our Mixed Integer Linear Programming (MILP) scheduling engine to natively accommodate the modelling and penalisation of reticle movements. This allows the user to define their own risk profile, so as to limit reticle movements solely to cases deemed of high value. In addition, the engine opens up the possibility to explore this tradeoff frontier in detail, enabling operators to quantify how their plant’s performance may change with a more relaxed or constrained attitude towards reticle movements.
Ioannis Konstantelos is a Principal Optimisation Engineer at Flexciton. He holds a PhD from Imperial College London and has published over 50 conference and journal papers on optimization and artificial intelligence methods. Ioannis joined Flexciton over 3 years ago and is involved in the development of Flexciton’s scheduling engine.
Charles Thomas is a Test Analyst with a background in Mechanical Engineering and a Master's degree from the University of Southampton. He has been at Flexciton for 2 years and leads the benchmarking and testing of the application with a particular focus on scheduling engine performance.