Wafer fabrication is not only a highly complex manufacturing process but also capital intensive. With the cost of a single new 300mm fab now exceeding $1bn and with some tools costing in excess of $40m, fixed costs are significant and demand high fab utilization.
In an environment where capacity and production efficiency are key to continued cost reductions, high-quality scheduling delivers a huge benefit to factory efficiency. It enables higher utilization of expensive toolsets (e.g., photolithography and etch), reduces cycle times and ensures on-time delivery.
We were thrilled to present with Seagate Technology once more. Jamie Potter, CEO & Co-founder of Flexciton, and Tina O'Donnell, Systems Engineering Manager from Seagate, discussed advanced scheduling technology and its impact on wafer fab production performance.
The webinar was hosted by SEMI Europe and moderated by TechWorks NMI.
June 15th, 5pm CEST / 4pm BST.
Watch this case study webinar to learn how Seagate is successfully using smart scheduling technology to optimize fab efficiency.
This webinar has now been removed from our website. To get exclusive access to the webinar, please get in touch by clicking here.
Reviewing technology literature is a common practice when developing a new approach to solving an existing problem. James Adamson, a Senior Optimization Engineer at Flexciton, has recently reviewed several technical papers on photolithography scheduling, one of which he found particularly interesting.
Photolithography is at the cutting edge of semiconductor manufacturing and, as a result, requires the most complex and expensive equipment to run and maintain. Reticles (also known as photomasks) must be prepared and loaded into litho tools before wafers can be processed. These fragile masks are extremely expensive (in the region of $100k or more [Weber 2006]), making them a scarce resource.
Wafers require specific reticles for each individual process step. Therefore, fab operators need to ensure that the correct reticles are at the correct tools on time in order to keep production KPIs, such as wafer throughput and cycle time, optimal. While reticles can be moved between tools, this takes time, and given how fragile these masks are, movement needs to be minimised as much as possible. If wafer scheduling wasn’t already difficult enough, we now have to wrestle with reticle scheduling too.
The paper “A Practical Two-Phase Approach to Scheduling of Photolithography Production” by Andy Ham and M. Cho was published in 2015. The authors present an approach based on the observation that most semiconductor manufacturing companies still use real-time dispatching (RTD) systems to make last-second dispatching decisions in the fab. RTD has the advantage of being familiar and relatively understandable whilst also being computationally fast. In contrast, some optimization-based approaches, particularly for photolithography, can struggle to scale up to industrial-size problems. The authors' approach exploits the idea that an exact schedule for the next several hours is not strictly needed and that RTD will ultimately be responsible for the final dispatch decision.
They propose a two-stage approach that integrates a simple heuristic (designed to mimic a fab’s RTD system) with mixed-integer programming (MIP):
The two stages are then tied together in an iterative fashion. A set number of lots are scheduled in each iteration of the two stages. The algorithm then repeats from Stage 1 with new additional lots and keeps iterating until all lots are scheduled.
A MIP approach is proposed to solve the assignment problem in the first stage. Two primary decision variables are used:
The model does not account for the explicit timing of lots on their allocated machines. It therefore cannot prescribe a sequence of lots or reticles on the machines; it only indicates that they will be scheduled on a given machine at some point. The model requires that all lots are assigned to a machine and a reticle. Finally, the model measures the completion time of each machine as a function of the processing times of all lots allocated to it, rather than explicitly deciding the order of each lot on the machine.
Multiple objectives are used to achieve the trade-off between reticle movements, cycle times, and machine load balancing:
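To make the flavour of this assignment stage concrete, here is a minimal sketch: a brute-force search over reticle-to-machine assignments that trades off reticle movements against machine load balancing. The data, the weights and the exhaustive search are illustrative assumptions, not the authors' actual MIP formulation.

```python
from itertools import product

# Toy Stage-1 assignment in the spirit of the paper (illustrative sketch
# only; data, weights and brute force are assumptions, not the real MIP).

lots = {            # lot -> (required reticle, processing time in minutes)
    "L1": ("R1", 60), "L2": ("R1", 45), "L3": ("R2", 30), "L4": ("R3", 50),
}
reticle_home = {"R1": "M1", "R2": "M1", "R3": "M2"}   # current reticle location
machines = ["M1", "M2"]

W_MOVE, W_BALANCE = 10.0, 1.0   # relative objective weights (assumed)

def cost(assign):
    """assign: reticle -> machine. Lots follow their required reticle."""
    moves = sum(1 for r, m in assign.items() if reticle_home[r] != m)
    load = {m: 0 for m in machines}
    for lot, (r, p) in lots.items():
        load[assign[r]] += p
    # Penalise reticle moves plus the most heavily loaded machine.
    return W_MOVE * moves + W_BALANCE * max(load.values())

best = min((dict(zip(reticle_home, combo))
            for combo in product(machines, repeat=len(reticle_home))),
           key=cost)
print(best, cost(best))
```

On this tiny instance, moving one reticle (R2) is worth the improved load balance, which is exactly the kind of trade-off the weighted objective is meant to expose.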
Sequencing decisions are handled by an RTD system, where the manufacturer’s custom business rules can be applied; however, the lot-reticle-machine assignment decisions are fixed. This reduces the scope of the decision-making that RTD must make. The authors highlight the benefit of explainability with this two-stage approach. When questioning assignment decisions, the assignment model should be explored, whereas when questioning sequencing decisions, RTD should be investigated.
This practical approach was shown to solve reasonable problem sizes (500 lots, up to 800 reticles, 30 machines) in 2-4 minutes. They managed to reduce cycle times by 3%, on average, and, particularly interestingly, reticle movements by up to 40% when compared to standalone RTD.
Although the model does have some shortcomings, as outlined above, the practicality of the approach makes it a strong candidate for production-size scheduling as very few studies have been able to effectively handle industrial problem sizes for photolithography tools. The reduction in reticle movements achieved, in particular, cannot be ignored.
However, with the notion of time largely ignored in the assignment model, the approach outlined is certainly simplistic.
There are a number of factors not considered in the model, including:
At Flexciton, we schedule a variety of photolithography tools as part of our optimization engine. Our hybrid optimization-based solution strategy is capable of handling all the intricacies of a wafer fab simultaneously, including the issues described in the previous section.
Not only do we model these complexities, but we also succeed in achieving high-quality schedules in little computation time.
The user is given the option of controlling various relative priorities of the lots, in addition to deciding the relative importance of KPIs such as reticle movements vs lots’ cycle time. The flexibility of an optimization approach that considers all of the advanced photolithography constraints combined with a self-tuning model that has limited tuning parameters is what makes our engine highly attractive as a semiconductor scheduler.
1) Weber, C.M.; Berglund, C.N.; Gabella, P. (13 November 2006). "Mask Cost and Profitability in Photomask Manufacturing: An Empirical Analysis". IEEE Transactions on Semiconductor Manufacturing. 19 (4). doi:10.1109/TSM.2006.883577
Time constraints (also known as timelinks) between consecutive process steps are designed to eliminate queueing time at subsequent steps. In a highly complex wafer fabrication environment, even the most advanced fabs struggle with scheduling time constraints. While our engineering team works on applying Flexciton technology to solve the timelinks problem, Begun Efeoglu Sanli, one of our Optimization Engineers, reviews a recently published technical paper on this particular subject.
A silicon wafer undergoes fabrication through multiple production steps, each performed by different, highly sophisticated tools. Optimizing the transition and waiting time of the lots has a huge impact not only on a fab's production performance but also on its profitability. As an example, by introducing time constraints at the wet etch and furnace process steps, we reduce the likelihood of oxidation and contamination. Failing to do so risks contact failures and low, unstable yields, the consequence of which is that wafers must either be reworked or scrapped. Such problems are difficult to discover during wafer processing, and running special monitoring lots would be a considerable effort.
Yield optimization has long been considered to be one of the key goals, yet difficult to achieve in semiconductor wafer fab operations. As the semiconductor manufacturing industry becomes more competitive, effective yield management is a determining factor to deal with increasing cost pressures. Time links between consecutive process steps are one of the most difficult constraints to schedule, with a significant impact on yield management.
Some factories avoid the problem by dedicating tools to each process group that requires a previous cleaning or etch step. This strategy's obvious disadvantage is the higher demand from wet tools, which leads to higher investment, more cleanroom space, and ultimately to lower capital efficiency. The tradeoff between increasing throughput and a higher likelihood of violating lots’ time constraints is an everyday battle for fab managers trying to meet yield targets.
An example of time constraints for a single lot is illustrated in Figure 1 below. It shows a time link system between four consecutive process steps. In this example, we can see that the lot has time links constraining Step 2 to Step 4 as well as from Step 3 to Step 4, with overlapping time lag phases (also known as a nested time link constraint). This means that after completing process Step 3, the lot begins a new time lag phase (Time Link 3) whilst already transitioning through an existing time lag (Time Link 2) started upon completion of Step 2. As you might expect, the need to simultaneously look ahead and consider future decisions whilst also being constrained by past decisions is not trivial to model well in a heuristic or as real-time dispatch rules.
Time constraints are already difficult to navigate, but nesting them adds yet another layer of complexity for heuristics to wrestle with. In the example below, if the final step cannot be brought forward, scheduling Step 3 too close to Step 2 may make it impossible to meet "Time Link 3". This is because the time between Steps 3 and 4 would then be greater than the maximum allowed. This would not be a problem if the time constraints were not nested and we only had to schedule according to "Time Link 2".
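The nested-link pitfall can be illustrated with a small feasibility check. This is a sketch only; the step timings and link limits below are invented for demonstration.

```python
# Check a set of time links against step completion times (toy example;
# the timings and limits are assumptions for demonstration).

def violations(end, links):
    """end[i]: completion time of step i (hours).
    links: list of (from_step, to_step, max_gap_hours)."""
    return [(a, b, g) for a, b, g in links if end[b] - end[a] > g]

# Nested links: Step 2 -> Step 4 (max 6 h) and Step 3 -> Step 4 (max 2 h).
links = [(2, 4, 6.0), (3, 4, 2.0)]

# Step 3 scheduled right after Step 2, with Step 4 fixed at t = 7:
early = {2: 1.0, 3: 1.5, 4: 7.0}
print(violations(early, links))   # link (3, 4) violated: 5.5 h gap > 2 h limit

# Delaying Step 3 satisfies both nested links:
late = {2: 1.0, 3: 5.5, 4: 7.0}
print(violations(late, links))    # []
```

The point is that Step 3's start time cannot be chosen greedily; it is constrained both backwards (by Time Link 2) and forwards (by Time Link 3).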
Surprisingly, although time constraints are an important topic, they have not been widely discussed in the technical literature so far. That said, an interesting paper on the subject was presented at the Winter Simulation Conference in December 2012 by A. Klemmt and L. Mönch of Infineon Technologies and the University of Hagen: “Scheduling Jobs with Time Constraints between Consecutive Process Steps in Semiconductor Manufacturing”.
The authors propose a mixed-integer programming (MIP) model formulation and share some preliminary experimentation. Unfortunately, even state-of-the-art MIP solvers can only solve problem instances of up to 15 jobs and 15 machines to optimality in a reasonable amount of time.
Consequently, the authors develop two alternative approaches:
This novel decomposition approach allows considerably larger problem instances to be solved, including more than 100 machines, more than 20 steps, nested time constraints, and a large number of jobs.
The paper referenced above highlights that both approaches can provide good feasible schedules quickly and as expected, the MIP-based heuristic outperforms the simple heuristic. Nevertheless, as with most heuristic approaches, there are some important tuning parameters that might affect the schedule quality. In the paper, the lots are sorted with respect to their due dates to build subproblems. This approach could perhaps be reevaluated if cycle time is the most important KPI for a fab or if some jobs are of higher priority. Similarly, if time constraint violations are allowed to some degree, then one could relax the importance of this in the heuristic.
The most important consequence, which is also mentioned, is that the cycle time of time-constrained processes correlates highly with the utilization of upstream (start of the time link) processes. Keeping upstream utilization low eliminates waiting time in front of the upstream tool and helps the time links to be met. However, if downstream tools are bottlenecks, then WIP may have to be withheld so that time constraints are not violated by lots stagnating in front of busy tools.
Another tradeoff to be considered is how low-priority time-constrained steps are scheduled among high-priority non-time-constrained steps. For example, is it worth risking a time constraint violation for the sake of rushing an urgent lot through the toolset? This needs to be quantified and considered by the fab manager. Therefore, all these tradeoffs should be taken into account in order to provide the best schedule.
At Flexciton, we include time constraints as part of our optimization engine that attempts to eliminate all violations of time links as the highest priority. Only as a last resort are time constraints relaxed if it is not possible to provide an otherwise “feasible” schedule. This could occur if the time windows provided are unrealistically short considering all operational constraints, especially tool capacities.
As mentioned in the previous publications, the Flexciton optimization engine is a multi-objective solution that can balance various KPIs according to user-chosen weights, one of which controls the degree to which violations of time constraints are penalized. The main advantage of this approach is that, with all the other competing objectives, our solution can balance throughput, cycle time and priority-weighted time constraint violations simultaneously.
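As a toy illustration of this kind of priority-weighted, multi-objective balancing (a sketch only; the linear form, the weights and the lot data are assumptions for demonstration, not Flexciton's actual model):

```python
# Priority-weighted soft-constraint objective (illustrative assumptions).

def objective(schedule, w_cycle=1.0, w_violation=100.0):
    """schedule: list of lots with cycle_time (hours), priority, and
    time-link violation (hours over the limit, 0 if none)."""
    cycle = sum(lot["cycle_time"] for lot in schedule)
    viol = sum(lot["priority"] * lot["violation"] for lot in schedule)
    # A large w_violation makes eliminating violations the top priority,
    # relaxed only when no violation-free schedule exists.
    return w_cycle * cycle + w_violation * viol

lots = [
    {"cycle_time": 4.0, "priority": 2, "violation": 0.0},
    {"cycle_time": 3.5, "priority": 1, "violation": 0.5},
]
print(objective(lots))   # 7.5 + 100 * 0.5 = 57.5
```

Scaling the violation weight up or down is what shifts the balance between strict time-link compliance and the other KPIs.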
In the constant pursuit of improved efficiency in semiconductor wafer fabs, the reliability of equipment is essential. The tools used in the fabrication process are extremely sophisticated, requiring an extensive preventive maintenance regime to ensure reliable production. A big challenge faced by fab managers is putting optimal scheduling of preventative maintenance in place whilst still meeting their production KPIs, simply because such scheduling is extremely complex and time-consuming, involving many trade-offs. What’s more, the exact impact on productive output is difficult to quantify.
Typically, to handle this complex problem, a fab may develop statistical models that try to predict unexpected tool downs. Such preventive maintenance – based on a pre-determined frequency – can help to minimise unexpected disruptions.
However, determining optimal maintenance frequencies is not an easy task, requiring answers to numerous questions and trade-offs that impact the eventual ability of a fab to meet its KPIs. Such questions include:
But what if it were possible to use your KPIs as the basis for optimizing maintenance scheduling? Instead of relying on a simple rule-based predictive model, such scheduling weighs all the constraints and finds the optimal schedule that will enable you to meet your KPIs.
Flexciton's scheduling technology addresses all such questions by finding the optimal schedule for your fab in any variety of forecasted conditions.
A ‘what-if’ scenario capability allows fab managers to effortlessly trial new preventative maintenance plans based on a variety of trade-offs or constraints. In addition, rather than dictate the time that tools must be taken offline, our optimizer will ensure all KPIs are achieved as best as possible, given the constraints.
By doing so, it prescribes the optimal maintenance schedule for the factory. All the fab manager has to do is to decide on suitable windows of time for each of the tools to be taken down.
Let’s see what happens in three scenarios where we apply Flexciton’s maintenance scheduling capabilities with varying degrees of scheduling complexity. The scenarios are structured as follows:
The Gantt chart below (Figure 1) shows a snapshot of 300 lots scheduled in small toolsets over the course of twelve hours. Each lot can only go to a certain number of tools within that toolset, where the toolset is identifiable by the tool’s prefix. Each lot is assigned a priority. We optimize for the total cycle time of the lots, weighted by their priority. The maintenance periods (shown in striped orange) are of varying duration and are randomly assigned to tools to take place at a specific fixed time somewhere in the twelve-hour schedule.
In Case 1, we compare the logic of scheduling, given these fixed maintenance timings, with a heuristic dispatcher against Flexciton.
Here we can see that on the ‘XZMW/097’ tool, the dispatch system struggled to ‘look ahead’ and dispatch effectively when faced with obstacles such as the upcoming downtime just after 02:00. It would be better to dispatch a short-processing lot in the meantime. An even more ideal schedule could flexibly move the downtime around to maintain consistent, predictable throughput across the schedule.
So, what if the scheduler is allowed to prescribe the timings that it finds optimal? The following Gantt chart is from Case 3, where the optimizer is free to plan the maintenance at any time within a 90-minute window.
To get a quick understanding of the results achieved across the three scenarios, we use queue time as our KPI. In the table below, you can see that the flexible maintenance approach greatly outperforms a simple dispatch heuristic. Obviously, queue time is just one dimension; there are many more constraints in a fab process that need to be considered. It is here that our maintenance optimization solution offers fabs unique capabilities: weighing all possible constraints to ensure KPIs are met to the fullest.
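In miniature, the flexible-window idea behind Case 3 can be sketched as a brute-force search for the maintenance start time that minimises total lot queue time on a single tool. All timings and lot data below are invented for illustration; a real scheduler optimises many tools and KPIs simultaneously.

```python
# Choose a maintenance start time within a flexible window to minimise
# total lot queue time on one tool (toy data, single-tool FIFO model).

def total_queue_time(maint_start, maint_dur, lots):
    """lots: list of (arrival, processing) in minutes, sorted by arrival."""
    t, queue = 0.0, 0.0
    for arrival, proc in lots:
        start = max(t, arrival)
        # Non-preemptive: push the lot past maintenance if they overlap.
        if start < maint_start + maint_dur and start + proc > maint_start:
            start = max(start, maint_start + maint_dur)
        queue += start - arrival
        t = start + proc
    return queue

lots = [(0, 40), (10, 20), (50, 30), (120, 25)]
window, dur = (60, 150), 30          # flexible 90-minute start window

best = min(range(window[0], window[1] + 1, 5),
           key=lambda s: total_queue_time(s, dur, lots))
print(best, total_queue_time(best, dur, lots))
```

Starting maintenance at the right point inside the window slots it into a natural gap in the lot sequence, instead of pushing work back at a fixed, arbitrary time.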
So instead of letting predictive maintenance schedules drive production, why not let the driver of maintenance planning be your fab's top-line production KPIs? The Flexciton optimizer allows easy scenario testing and exploration in order to effectively quantify the impact that maintenance has on the production schedule.
Flexciton’s solution enables fab managers to consider multidimensional trade-offs simultaneously. The alternative to such informed decision-making is that fabs schedule their maintenance ‘blind’. They will ultimately pay the price through unpredictable cycle time, unsatisfactory throughput and unnecessary tool downtime. By switching over to smart scheduling, it is much easier to get an accurate prediction of the impact that modifying a downtime schedule will have in terms of meeting top-level KPIs. Learn more about smart scheduling by downloading our white paper, "Superior Scheduling: hybrid approach boosts margin"
Preventive maintenance is a common practice in semiconductor wafer fabrication and essential for overall equipment availability and reliability. A typical approach is to plan maintenance activities ahead of time using simple rules-based models, where the maintenance is run on a particular day, at a particular time. The consequence of such an approach, however, is optimising maintenance timing at the expense of production KPIs such as cycle time and throughput. What if we consider it the other way around and treat these KPIs as the priority in the objective?
The complexity of semiconductor wafer fabrication entails a huge number of decisions and trade-offs that a fab manager has to deal with each day. Preventive maintenance is one of them. The equipment used in the fabrication process is extremely capital intensive; therefore, it is critical that tools are utilised effectively and maintained on a regular basis to avoid failures. Any servicing requires stopping a tool and suspending it from the manufacturing process for a given period of time. With the recent chip shortage, particularly in the automotive industry, a fab manager faces a significant challenge: how to schedule preventive maintenance operations whilst ensuring maximum OTD and high throughput?
Maintenance scheduling is an established topic of research, with many authors showcasing various ways of solving this scheduling problem using simulation and optimisation techniques. An interesting paper on the topic was presented at the Winter Simulation Conference in December 2020 by A. Moritz et al. of Mines Saint-Étienne and STMicroelectronics: “Maintenance with production planning constraints in semiconductor manufacturing”.
In this article, Ioannis Konstantelos, our Optimization Technology Lead, reviews the paper and explains Flexciton's approach to this complex topic.
The authors focus on identifying the best possible period of time (e.g. a day), across a large time range, in which to carry out maintenance tasks, while “respecting production deadlines and the capacity constraints on tools”. Two mathematical models are presented; in model 1, the maintenance is seen as a task that must be performed in a single period of time, e.g. one day (24h), while model 2 allows maintenance to be distributed across two consecutive periods, e.g. two days (48h).
Both models treat production schedules as fixed, i.e. the lot-to-tool assignments and timings for production purposes have been decided a priori. As such, the proposed formulation is a discrete-time model*, allowing maintenance to be performed only at defined points in time. The model uses the following decision-making variables:
There is a limit on the total time allocated to each of production and maintenance tasks.
The model's objective function is a combination of maximising the number of maintenance tasks that can be performed within the time horizon and the earliness of these tasks. A user defines parameters to tune the importance of each of these aspects.
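The spirit of that objective can be shown with a tiny brute-force sketch: place maintenance tasks into discrete periods with limited spare capacity, maximising first the number of tasks placed and then their earliness. The capacities, task durations and weights below are illustrative assumptions, not the paper's data.

```python
from itertools import product

# Toy discrete-time maintenance placement: maximise tasks placed, then
# earliness (all data and the alpha/beta weights are assumptions).

capacity = [4, 2, 6, 3]          # spare hours per one-day period
tasks = [3, 2, 2]                # maintenance durations in hours
ALPHA, BETA = 10.0, 1.0          # weights: tasks placed vs. earliness

def score(placement):
    """placement[i]: period index for task i, or None if unplaced."""
    used = [0] * len(capacity)
    for dur, p in zip(tasks, placement):
        if p is not None:
            used[p] += dur
    if any(u > c for u, c in zip(used, capacity)):
        return float("-inf")     # violates a period's capacity
    placed = [p for p in placement if p is not None]
    return ALPHA * len(placed) - BETA * sum(placed)

best = max(product([None, 0, 1, 2, 3], repeat=len(tasks)), key=score)
print(best, score(best))
```

Even in this toy instance the earliest feasible packing is not obvious: the longest task must skip the early periods so that the two short tasks can fill period 0.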
The two-step approach showcased in the paper leaves a core question unanswered: the trade-off between production and maintenance. Typically, production scheduling aims to optimise a particular KPI, such as cycle time or throughput. By treating the production schedule as fixed and optimising the number and earliness of maintenance operations around it, we ignore the trade-off to the KPIs that matter, and there may well be foregone synergy opportunities.
The paper rightly highlights the need to consider maintenance using a formal mathematical framework. Nevertheless, there are some assumptions that limit the applicability and benefit of the proposed approach.
Want to know more about the different wafer fab scheduling approaches, including heuristic, mathematical and hybrid approaches? Read our whitepaper, where we cover everything about wafer fab scheduling approaches.
Results are presented for 12 real-world case studies, involving around 100 maintenance tasks to be scheduled for 14 tool families over a span of 60 one-hour periods. The more relaxed model (model 2) is shown to perform better, both in terms of the number of maintenance tasks planned and total earliness. One strength of the proposed approach is the speed of computation. As the authors state, the proposed model can be the basis for iterative discussions between production and maintenance planners.
Ideally, production and maintenance scheduling should be tackled in a single model, where an objective function based on cycle time and throughput applies. This can be achieved by treating maintenance as tasks that need to be scheduled within a specific window. Thereby, fab managers can explicitly consider the impact that maintenance tasks have on the schedule, such that the impact on production KPIs is minimised. Of course, such integrated approaches result in a substantial increase in problem size and complexity, necessitating solution strategies capable of handling the ensuing complexity. Especially in cases of a large number of maintenance tasks or lengthy maintenance, such constraints can quickly render a problem intractable.
At Flexciton, we have developed a smart scheduling solution that uses decomposition techniques to manage the added complexity introduced by maintenance constraints. Users can describe their maintenance tasks as “optional” or “must-run”, and give them either a fixed start time or a flexible time window within which they can be carried out.
The Flexciton engine then optimises the target production KPIs while respecting maintenance constraints. The resulting production schedule prescribes the best time to carry out maintenance while capturing all individual tool characteristics and respecting all operational constraints, so as to achieve the best use of available assets. Learn more about the technology behind Flexciton's smart scheduling solution.
* Discrete time and continuous time are two alternative frameworks within which to model variables that evolve over time. Discrete time views values of variables as occurring at distinct, separate "points in time". In contrast, continuous time views variables as having a particular value for potentially only an infinitesimally short amount of time; between any two points in time there are an infinite number of other points in time.
Insightful experiments expose the weakness of limiting the number of recipes enabled on a tool. The key findings are that this limitation can lead to an increase in fab cycle times by more than 40 percent.
It’s not easy managing a fab. While the goals are simple – maximising the yield and throughput – execution is really hard. That’s partly because fabs churn out a range of products, produced within many tools and processes; and partly because the goalposts constantly shift, due to the dynamic nature of the environment.
Sometimes the need for change reflects success in the business. After winning a new order, those running a fab may need to develop and run new recipes before they can manufacture these latest products. Unfortunately, this requires manpower to implement, it could dictate the need for more regular maintenance of a tool, and evaluating KPIs could prove tricky.
At other times those that run a fab will have no warning of the need for change, and will be forced to make tough decisions at breakneck speed. If a tool suddenly fails to process material within spec it has to be taken off-line and assessed. Meanwhile compromised wafers are etched back and processed via a different route through the fab, potentially involving alternative tools running new recipes.
To simplify operations within a fab, many managers restrict some tools to processing only particular products. This is accomplished by limiting the number of recipes on selected tools. It’s a tempting option that might reduce tool maintenance, but it is not without risk. While the hope is that this course of action has negligible impact on the throughput of the fab, there is a danger that it could make a massive dent in the bottom line.
Up until now, fab managers have taken an educated guess at what the implications might be. But they would clearly prefer a more rigorous approach - and thankfully that is now within their grasp, due to the recent launch of our Optimization Scheduler.
To illustrate some of the powerful insights that can be garnered with our software, we have considered the consequences of restricting the use of tools within a hypothetical fab. Our key findings are that this can lead to a massive hike in waiting times at particular tools, and ultimately increase fab cycle times by more than 40 percent.
We reached these conclusions after performing a pair of complementary case studies. The first, considering a single randomly generated dataset, allowed us to take a deep dive into the frequency of use of particular tools and their corresponding wait times. The second, involving twenty randomly-generated datasets, allowed us to evaluate the impact of restricting the use of tools on the cycle times for the fab.
In the fab that we modelled there were six toolsets, each with four tools. All ran until the fab had carried out what we describe as 1,000 work units – that is, the fab operated until it clocked up 1,000 steps across all lots through the modelled tools (one lot accrues one work unit for every step completed along its route).
For our first case study, which considered a single randomly-generated dataset, we distributed the work units between the toolsets in the following manner:
The objective of this case study was to examine how, if we were to vary the number of recipes enabled on tools, the corresponding wait times at tools would change. We simulated 4 different scenarios. In the most restrictive scenario, a work unit only had the option to be assigned to 1 tool (because only that tool had the recipe enabled). In the opposite, most generous scenario, a work unit could be assigned to any of the 4 tools within each toolset (because every tool has the corresponding recipe enabled).
Clearly, restricting the number of tools has unwanted consequences on waiting times. For all six types of tool that we considered, a decline in recipe availability increased the total wait time. This is a non-linear relationship, with by far the greatest difference in wait time found when availability shifted from two tools to just one. The impact of having just one tool available is also tool dependent. For the six types of tool considered, the increase in wait time over baseline varied from less than 250 percent to just over 700 percent.
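The qualitative pattern, sharply non-linear growth in waiting as recipe availability shrinks, can be reproduced with a minimal queueing sketch. The arrival pattern and processing times below are assumptions for illustration, not the case-study dataset.

```python
import random

# Minimal queueing sketch of the recipe-restriction effect: each lot may
# only run on the first k tools of a 4-tool toolset (assumed toy data).

def total_wait(k, n_lots=200, seed=7):
    rng = random.Random(seed)
    free_at = [0.0] * k                  # next-free time of each enabled tool
    t, wait = 0.0, 0.0
    for _ in range(n_lots):
        t += rng.expovariate(1 / 10)     # mean 10 min between arrivals
        proc = rng.uniform(20, 40)       # processing time in minutes
        i = min(range(k), key=lambda j: free_at[j])   # earliest-free tool
        start = max(t, free_at[i])
        wait += start - t
        free_at[i] = start + proc
    return wait

waits = {k: total_wait(k) for k in (1, 2, 3, 4)}
print(waits)   # total wait grows sharply, and non-linearly, as k shrinks
```

Because the same lot stream is replayed for each value of k, the blow-up in waiting when availability drops from two enabled tools to one is directly comparable across scenarios.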
Of course, wait time is only a part of overall cycle time. To investigate the impact of tool availability on total fab cycle times, we considered twenty datasets, each with a slightly different distribution of work units allocated to toolsets. For this investigation we maintained our requirement for 1,000 work units, undertaken by six toolsets, each with four tools.
For this simulation, we flexed the recipe availability and calculated the change in cycle time. We found that when all tools were available and capable of running all recipes, flexibility was as high as it could be, allowing the fab to run at its full throughput. Reducing tool availability by limiting recipes led to a significant increase in cycle time.
Plotted in the graph below are increases in cycle time resulting from a reduction in tool availability. These values are relative to the theoretical minimum cycle time, realised when all four tools are available and flexibility maximised. While there is a variation in impact across the 20 datasets, the trend is clear: when tool availability reduces, cycle time takes a significant hit. Averaging the results across all datasets (depicted by the bold line) shows that when the proportion of tools available falls to 25%, cycle time lengthens by more than 40%.
This pair of case studies, carried out with our smart scheduling software, has uncovered valuable lessons. While simplifying production by limiting the recipes run on toolsets may be tempting, it can cost a fab a significant increase in cycle times. By utilising the "what-if" capabilities of our Optimization Scheduler, fab managers can run different scenarios for data-driven decision making and ultimately become better informed about the impact of their choices on the shop floor.