Why Optimized Scheduling is the Answer to Balancing Reticle Moves and Cycle Time

Introduction
Photolithography processes are central to producing computer chips and semiconductor devices. However, they are typically considered to be bottlenecks due to their reliance on a critical secondary asset; reticles. Reticles are limited in number and yet are a critical piece of the coat-expose-develop loop. What is more, reticles are delicate in nature; they are enclosed in purpose-built cases for their transport in order to keep the potential of damage or distortion to a minimum.
As such, a fundamental tradeoff arises when operating photolithography toolsets: moving a reticle to the machine it is needed most (to carry out high-priority tasks) clashes with the requirement to be conservative with its transport. In theory, there are several compromises that the operator can make to reduce reticle movements - waiting a bit longer to ensure more wafers arrive to a machine and a larger batch can be processed with a single move is one example. However, in practice, identifying these strategic actions and balancing between the competing goals is highly complex. Flexciton can provide a solution to this issue by leveraging the power and flexibility of optimisation.
In this article, we show how the Flexciton’s scheduling engine can balance between minimising cycle times and reticle moves. Through a series of example case studies, we delve into the scheduling trade-offs that arise in the day-to-day operation of a semiconductor fab and how Flexciton’s solution can assist in uncovering schedules that optimally balance across competing goals.
Trade-off frontier with the Flexciton scheduler
The Flexciton scheduler can accommodate a range of user-defined objectives. The fab operator is typically interested in minimising KPIs such as cycle times but may also want to include other considerations, such as penalisation of labour-intensive decisions e.g. number of batches built. In this vein, we have recently introduced a new component; the number of reticle moves carried out. As shown in Figure 1, the user is able to define a penalty factor for reticle moves; the higher the value, the harder the engine will try to avoid moving reticles.

In the example case study we have 6 machines and 48 wafers to be scheduled, with a total of 4 reticles.
Reticle 10001 is required for all lots schedulable on machines 01, 02 and 03. Deciding on how to move this in-demand reticle across the 3 machines will impact both the number of moves as well as cycle times, particularly as there are some high-priority wafers waiting to be dispatched. Reticle 10001 is originally loaded in machine 01.
The other three reticles, 20001, 20002 and 20003 are initially loaded on machines 04, 05 and 06 and can be used by all three toolsets interchangeably. However, different machines are better suited to different reticles; for example in our case study, the same process is completed faster if using reticle 20001 on machine 06. Note that all machines have a maximum batch size of 4 wafers.
Case Study 1: Focusing solely on cycle times
We start off by not penalising the number of reticle moves and solely minimising the total priority-weighted cycle times (TWCT) across all wafers. The optimal schedule produced by Flexciton’s engine is shown in Figure 2.
There are a total of 7 reticle moves, noted with red arrows in the figure below. 4 moves pertain to reticle 10001 which is moved from its initial location 01 to 03 and then 02 to carry out some high-priority wafers (as evidenced by the circled 1/2/3 next to the job names). The reticle is then moved out again to machine 01 and finally to machine 03 to carry out some lower priority jobs. Looking at machines 04, 05 and 06, the engine decides to immediately swap the reticles between the machines, to ensure that each lot is fed to its most suitable (in terms of processing times and capability) machine. The TWCT of all 48 wafer steps (where priority weights are user-defined and in this case range from 1 for highest-priority to 0.1 for lowest-priority wafers) is 16.79 hours.

Case Study 2: Moderate penalisation of reticle moves
In this second study, we have penalised reticle moves; the ratio for balancing TWCT and reticle moves has been set to 100:75 i.e. we will choose to avoid a reticle move only if its avoidance translates to an increase of TWCT of 0.75 hours or less. This is quite relaxed, but is aimed at avoiding reticle moves with little benefit, since the risk of potential damage is deemed higher. The optimal schedule obtained is shown in Figure 3.
In this study, there are a total of 5 reticle moves, noted with red arrows in the figure below. 3 moves pertain to reticle 10001 and its journey across machines 01, 02 and 03. The main difference to the previous scheduling pattern is that now we do not move the reticle back to machine 03 to carry out the very last batch of low priority wafers. Instead, we choose to wait for their arrival and carry them right after the high-priority batch finishes a bit after 19:00. This way we avoid that final reticle move, while also incurring a delay in the high-priority wafers scheduled on machine 02 which now have to be moved from 19:15 (in study 1) to 19:30.
Looking at machines 04, 05 and 06, the engine decides to immediately swap only two the three reticles this time, and leave reticle 20001 on its initial machine. Although that initial setup is not ideal in terms of processing times it does prevent the reticle move deemed to be “lower value”. The TWCT of all 48 wafer steps is 18.73 hours.

Case Study 3: Only necessary reticle moves
In this study we look at the extreme case of using a very high penalty on reticle moves, hence allowing only absolutely necessary reticle moves. In particular, we have opted to use a TWCT to reticle move cost ratio of 1:10. In such cases, the operator is willing to accept sub-optimal job-machine allocation decisions, as well as delayed scheduling of high-priority wafers, for the purpose of keeping reticle movement to the absolute minimum. The optimal schedule obtained is shown in Figure 4.
In this study, the total number of reticle moves has come down to just 2 moves, noted with red arrows in the figure below. Both moves pertain to reticle 10001 and its journey across machines 01, 02 and 03 to ensure all wafers are completed. In the case of machines 04, 05 and 06, we are still able to carry out all tasks, albeit with longer processing times, as evidenced in the much later finishing times of the machines. The TWCT of all 48 wafer steps is 23.20 hours.

Exploring the trade-off frontier
Plotting the aforementioned runs (and also some more data points), we obtain Figure 5, which clearly illustrates the trade-off at play here. As we traverse the penalty factor from a low to a high value, the number of reticle moves drops and the cycle times increase. As expected, these relationships are monotonic but not smooth, since they depend on discrete events. Note also that both curves are bounded both from above and below, corresponding to the absolute minimum number of reticle moves required (in this case 2) and the absolute maximum number of reticle moves that is optimal (in this case 7).
By running a few scenarios with different parameters, the Flexciton engine opens up the possibility to explore the tradeoff frontier in detail, enabling operators to quantify how KPIs would change with a more relaxed or constrained attitude towards reticle movements.

Performance in real-world applications
In practice, enabling the Flexciton scheduling engine to consider reticle moves is a computationally challenging task, involving novel development in the model’s MILP formulations and heuristics. Nevertheless, this feature has been accommodated with no deterioration to performance and schedule quality. The Flexciton engine is capable of scheduling thousands of wafers across hundreds of machines in a few minutes while also controlling for the operator’s tolerance to reticle movement.
Indicatively, we showcase results obtained from scheduling a real-world fab plant. At the time of the study, the plant had a total of 3,478 wafers to be scheduled on 209 toolsets (with a total of 358 load ports). We computed two schedules: one with low and one with high penalisation of reticle moves. These scheduling runs were computed in roughly the same time: respectively, confirming that despite the added complexity, this feature can scale well and provide a schedule in a few minutes.
Focusing on the reticle machines, the results obtained suggested that reticle movements could be reduced by around 26% while leading to an increase in total cycle times of around 2%. Note that these results are priority-weighted, with further analysis revealing that high-priority wafers are not substantially impacted; the optimiser is able to identify “low-value” reticle movements relating to e.g. early processing of a low priority wafer and either avoid that movement by using an alternative recipe, or deferring that movement to later when a low-priority wafer can be combined with a high-priority wafer in a batch.
Conclusions
Reticle scheduling is a very important consideration in the scheduling of advanced semiconductor fabrication plants. This resource, already highly constrained, comes with a critical consideration in practice: frequent movements and manual handling of the delicate reticles increase the risk of damage or distortion during transport. As such, the number of times a reticle is moved to a new machine must be managed conservatively. This inadvertently clashes with the operator’s fundamental objective of reducing cycle times.
Flexciton has extended the capabilities of our Mixed Integer Linear Programming (MILP) scheduling engine to natively accommodate the modelling and penalisation of reticle movements. This allows the user to define their own risk profile, so as to limit reticle movements solely to cases deemed of high value. In addition, the engine opens up the possibility to explore this tradeoff frontier in detail, enabling operators to quantify how their plant’s performance may change with a more relaxed or constrained attitude towards reticle movements.
Authors
Ioannis Konstantelos is a Principal Optimisation Engineer at Flexciton. He holds a PhD from Imperial College London and has published over 50 conference and journal papers on optimization and artificial intelligence methods. Ioannis joined Flexciton over 3 years ago and is involved in the development of Flexciton’s scheduling engine.
Charles Thomas is a Test Analyst with a background in Mechanical Engineering and a Masters degree from the University of Southampton. He has been at Flexciton for 2 years and leads the benchmarking and testing of the application with a particular focus on scheduling engine performance.
Useful resources
Stay up to date with our latest publications.