Abstract
The introduction of multicore architectures has led to the emergence of multi-threaded applications, which are commonly susceptible to synchronization bugs such as data races. Software tools that detect data races incur high runtime overhead. Hardware support for data race detection is an alternative that is cheaper and faster than software-only detection. Almost all hardware solutions that have been proposed for race detection depend on the happens-before algorithm. This algorithm is sensitive to thread interleaving and cannot detect races that are not exposed during the monitored run. Many algorithms and mechanisms have been devised to address this challenge. This paper surveys the strategies that have been devised to detect and tolerate such bugs.
Introduction
Software bugs have long been a concern for researchers, which is why techniques for detecting and fixing them have grown rapidly. Developments in multi-core architectures have further accelerated the trend towards multi-threaded applications. Although multithreading increases performance, it also adds complexity to software development and makes programs susceptible to synchronization bugs.
Hardware-Assisted Lockset-based race detection
Many programming techniques are used to improve the performance of server and scientific applications. One of the most popular is multithreading, whose adoption has recently been driven by the emergence of multi-core architectures.
A data race is one common synchronization bug. It occurs when two threads access the same shared variable without synchronization and at least one of the accesses is a write. Because data races are non-deterministic and sensitive to timing, they are difficult to expose, reproduce, and diagnose. As a result, data races can remain in programs that have been tested extensively and cause damage when the programs are in use. With advances in micro-architecture, the hardware community has been able to devote part of the hardware budget to reducing the cost of bug detection and making it easier. There have been proposals to use hardware support for race detection. These approaches are cheaper and faster than software-only detection of data races: hardware support reduces the slowdown to acceptable levels and makes on-the-fly detection possible while the system is running.
In this survey, race detection mechanisms are categorized by the algorithm they use, since understanding the algorithm clarifies what a mechanism can and cannot detect. The most popular algorithm for data race detection is the happens-before algorithm. It monitors the execution of a program dynamically and orders memory accesses based on program order and synchronization order. A data race is reported if no happens-before ordering exists between two conflicting accesses. The basic idea of a hardware implementation is to store the access history in the cache. This information is communicated through the cache coherence protocol and is used to check for accesses that violate the happens-before order. One disadvantage of this algorithm is that it can only detect data races that manifest in the monitored execution. Other races become visible only under particular thread interleavings, and it is impossible to test every combination of interleavings. The happens-before algorithm therefore misses such races, and a different algorithm is required to find them.
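The happens-before check can be illustrated with a minimal Python sketch. It assumes that each access carries a vector clock (one counter per thread, advanced at synchronization points); the names `Access`, `happens_before`, and `is_race` are hypothetical and are not taken from any of the surveyed systems.

```python
from dataclasses import dataclass


def happens_before(a, b):
    """True if vector clock a is ordered strictly before vector clock b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


@dataclass
class Access:
    thread: int      # id of the accessing thread
    address: str     # shared location that is accessed
    is_write: bool   # True for a store, False for a load
    clock: tuple     # vector clock of the thread at the time of the access


def is_race(first, second):
    """Two accesses race if they conflict and neither is ordered before the other."""
    conflict = (
        first.thread != second.thread
        and first.address == second.address
        and (first.is_write or second.is_write)
    )
    ordered = happens_before(first.clock, second.clock) or happens_before(
        second.clock, first.clock
    )
    return conflict and not ordered
```

For example, a write by thread 0 at clock (1, 0) and a read by thread 1 at clock (0, 1) are unordered and are reported as a race. If thread 1 had first acquired a lock released by thread 0, its clock would be (2, 1) and no race would be reported, which is why the result depends on the interleaving that was actually observed.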
The lockset algorithm was developed to overcome this limitation of the happens-before algorithm. Unlike happens-before, it can detect data races that do not manifest in a particular execution, and it is not sensitive to scheduling. While the program runs, the algorithm checks for violations of a locking discipline. An example of a locking discipline is that every access to a given shared variable must be protected by one common lock. To check this, the algorithm stores the set of locks that each thread currently holds, referred to as the thread lockset. It also stores, for each shared variable, the set of locks that have protected every access to it so far, called the candidate set. A lock is added to the thread lockset when the thread acquires it and removed when the thread releases it. The candidate set is initialized to all possible locks, and each time the variable is accessed it is intersected with the lockset of the accessing thread. If the candidate set becomes empty, no common lock protects the variable, which indicates a potential race.
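The bookkeeping described above can be sketched in a few lines of Python. This is a simplified software illustration, not the hardware design discussed in this section: the class name `LocksetDetector` is hypothetical, "all possible locks" is represented by the absence of an entry, and refinements such as special handling of variable initialization are left out.

```python
class LocksetDetector:
    def __init__(self):
        self.held = {}        # thread id -> set of locks the thread currently holds
        self.candidates = {}  # variable -> candidate set (missing = all possible locks)
        self.races = []       # variables whose candidate set became empty

    def acquire(self, thread, lock):
        self.held.setdefault(thread, set()).add(lock)

    def release(self, thread, lock):
        self.held.setdefault(thread, set()).discard(lock)

    def access(self, thread, variable):
        locks = self.held.get(thread, set())
        current = self.candidates.get(variable)
        # First access: "all locks" intersected with the thread lockset is the lockset itself.
        refined = set(locks) if current is None else current & locks
        self.candidates[variable] = refined
        if not refined and variable not in self.races:
            self.races.append(variable)  # no common lock protects this variable
```

If thread 1 accesses `x` while holding lock `A` and thread 2 later accesses `x` while holding only lock `B`, the candidate set of `x` becomes empty and a potential race is reported, even though the two accesses never overlapped in the monitored run.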
One problem with the lockset algorithm is that existing implementations are software-based and therefore incur a lot of overhead. Slowdowns of 10 to 30 times have been reported for some applications. Various solutions have been proposed to reduce this overhead. Some trade detection accuracy for performance; others rely on object-oriented languages to coarsen the monitoring granularity from a variable to an object. Using object-oriented languages has been found to reduce the overhead of race detection to a slowdown of only 2 to 3 times. However, this approach cannot be applied to legacy code, and the slowdown is still too high for production runs.
Although there have been many hardware implementations of the happens-before algorithm, little has been done to study hardware implementations of the lockset algorithm that would exploit its detection capability at low overhead. There is a need to exploit this algorithm.
AVIO
AVIO is another common solution for finding concurrency bugs. It is used to detect atomicity violations. Based on access interleaving (AI) invariants, AVIO learns from correct runs which code segments are expected to be atomic and then uses these invariants to detect atomicity violations online. AVIO does not require prior programmer knowledge or annotations. During training, it exploits two properties of concurrent programs to simplify the collection of invariants, which makes this process easier than invariant extraction for sequential programs. In AVIO, the thread whose atomicity has been interrupted is referred to as the local thread, and the accesses that it makes are referred to as local accesses. Atomicity bugs are no different from other kinds of bugs in that they result from a mismatch between the intention of the programmer and the implementation. Programmers tend to reason sequentially and assume that a pair of accesses to a shared variable is serializable, that is, free from interference by unserializable accesses from other threads. If the implementation does not enforce these atomicity assumptions, a bug emerges.
AVIO aims at detecting atomicity violations. The access interleaving invariant at its core captures the atomicity assumptions made by programmers. AVIO can be implemented in two ways: the first is purely in software, while the second requires a small extension to the hardware, specifically to the cache. The software implementation is suited to in-house testing during development, while the hardware implementation can be applied more widely.
AVIO focuses on atomicity rather than on freedom from data races, because data-race freedom does not guarantee correct synchronization: bugs can still occur in programs that are free of data races. Even in future programming models where data races are less of a concern, atomicity violations are expected to remain a challenge. Programmers have also shifted towards memory management techniques in both software and hardware.
Concurrency bug characteristics
Much research has been done on the characteristics of software bugs. However, little attention has been directed towards concurrency bugs, even though they are notorious in the programming world. This helps explain why existing detection mechanisms miss or are ineffective on many concurrency bugs, an indication that lessons from past mistakes have not been learned. Although the prevalence of bugs has led to an increase in firms that focus on bug detection, this is not enough if the main bugs are still being ignored.
It is also worth noting that very few researchers have studied real-world concurrency bugs. Only recently has the importance of such studies been recognized, and the studies carried out so far are too shallow to yield concrete strategies for countering these bugs. Some of the reasons given for this lack of attention are not convincing. One is that concurrency bugs are under-reported; an interested researcher could go beyond the reports to get the facts. Another is that these bugs are not easy to understand; this is a weak justification, since systems should not be exposed to risk merely because a concept is difficult.
The characteristics of concurrency bugs fall into three categories: the pattern of the bug, the manifestation of the bug, and the strategy for fixing the bug.
Bug pattern
Different bug patterns require different detection and diagnosis techniques. Non-deadlock concurrency bugs can be categorized into three patterns: atomicity violations, order violations, and others. The categories are distinguished from each other by the root cause of the bug.
Programmers have atomicity intentions for certain code regions and order intentions for certain operations, but enforcing these intentions completely is not easy.
Because programmers think sequentially, they tend to assume that small code regions will execute atomically. Programmers also commonly assume an order between two operations from different threads but forget to enforce it. One of the two operations may then execute earlier than the programmer expects, and an order bug manifests.
Concurrency bugs that violate other kinds of programmer intentions also exist, but they are rare. In one version of MySQL, programmers used a timeout threshold, fatal_timeout, to detect deadlocks: if a thread waited for a lock for longer than fatal_timeout, the server crashed itself. When the programmers set the threshold, they underestimated the workload, and as a result the MySQL server kept crashing under heavy load. The wrong assumption here is neither an atomicity intention nor an order intention. The bug was fixed by limiting the number of worker threads. By contrast, it is common for programmers to have an order intention between a write and a read of the same variable.
CTrigger
CTrigger focuses on one type of interleaving, the unserializable interleaving, which is closely related to atomicity violation bugs. An unserializable interleaving is not equivalent to any serial execution of the operations involved. Because atomicity is essentially the same as serializability in the context of concurrency bugs, focusing on unserializable interleavings gives good coverage for exposing atomicity violations while reducing the testing space.
The study focuses on the interleaving aspects of program testing in order to better understand atomicity violations, and it exposes such bugs so that they can be addressed. It does so with an emphasis on real-world bugs, which are likely to recur, and is therefore helpful for effective program development and design.
The approach is based on the observation that a bug that has occurred once can occur again. A deeper analysis of such occurrences puts programmers in a better position to diagnose these bugs and prevent them from finding their way into programs.
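What makes an interleaving unserializable can be worked out with a small Python sketch. It assumes a simplified model that is not spelled out in the text: two consecutive accesses by one thread to a single shared variable, with one access from another thread falling between them. The interleaving is serializable if it produces the same observed values and final value as running the remote access entirely before or entirely after the local pair. All names are hypothetical.

```python
from itertools import product


def run(order, kinds):
    """Execute accesses in the given order on one shared variable.

    kinds maps an access name to "R" or "W". Each write stores its own
    name, so reads reveal which write they observed.
    """
    value = "init"
    seen = {}
    for name in order:
        if kinds[name] == "W":
            value = name
        else:
            seen[name] = value
    return seen, value


def is_unserializable(first_local, remote, second_local):
    """True if local-remote-local matches neither serial order."""
    kinds = {"p": first_local, "r": remote, "c": second_local}
    interleaved = run(["p", "r", "c"], kinds)
    serial = [run(["r", "p", "c"], kinds), run(["p", "c", "r"], kinds)]
    return interleaved not in serial


def unserializable_cases():
    """Enumerate all eight read/write combinations and keep the unserializable ones."""
    return ["".join(t) for t in product("RW", repeat=3) if is_unserializable(*t)]
```

Under this model, four of the eight combinations are unserializable: a remote write between two local reads, a remote write between a local read and a local write, a remote read between two local writes, and a remote write between a local write and a local read. Restricting testing to these cases is what shrinks the interleaving space.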
Atomicity is a property of a set of concurrent operations whose effect on the data is the same as that of their serial execution. Programmers assume that some code regions are atomic, but their implementation may not guarantee atomicity. Consequently, atomicity can be violated when the regions are interleaved in an unserializable manner.
As noted earlier, programmers share common patterns in the way they work and think, notably the assumption that small code regions execute sequentially. Programmers also commonly assume an order between two operations from different threads but forget to enforce it, so one of the operations may execute earlier than expected and an order bug manifests. Concurrency bugs that violate other kinds of programmer intentions also exist, but they are rare.
InstantCheck
Developing multithreaded programs is a constant challenge, especially on shared-memory platforms. InstantCheck is a technique that checks determinism at runtime and requires little hardware support to do so. Code is tested to find out whether it is deterministic or nondeterministic. The check focuses on the external determinism of parallel code, as opposed to the traditional approach of enforcing internal determinism, which is not considered as effective. This is done by running the code many times under random schedules at the program testing stage.
One problem with parallel programming is non-determinism: a single input can produce several different outputs. This makes parallel programs hard to write, because it is difficult to reason about what a program does even when the input is fixed. Non-determinism arises from the interaction of threads through shared-memory accesses. InstantCheck has been proposed to check external determinism rather than to enforce internal determinism.
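One way to check external determinism is to hash the memory state at the end of each run and compare the hashes across runs. The Python sketch below is only an illustration of that idea and not the InstantCheck hardware: it assumes an additive hash that is updated incrementally on every write, and the names `HashedMemory`, `cell_hash`, and `final_hash` are hypothetical.

```python
import hashlib

MASK = (1 << 64) - 1  # keep the running hash within 64 bits


def cell_hash(address, value):
    """Hash one (address, value) pair to a 64-bit integer."""
    digest = hashlib.sha256(f"{address}={value}".encode()).digest()
    return int.from_bytes(digest[:8], "little")


class HashedMemory:
    def __init__(self):
        self.cells = {}
        self.state_hash = 0  # sum of cell hashes, modulo 2**64

    def write(self, address, value):
        # Remove the contribution of the old value, then add the new one.
        if address in self.cells:
            old = cell_hash(address, self.cells[address])
            self.state_hash = (self.state_hash - old) & MASK
        self.cells[address] = value
        self.state_hash = (self.state_hash + cell_hash(address, value)) & MASK


def final_hash(writes):
    """Replay a sequence of (address, value) writes and return the state hash."""
    memory = HashedMemory()
    for address, value in writes:
        memory.write(address, value)
    return memory.state_hash
```

Because the hash is a sum, two runs that perform the same independent writes in a different order end with the same hash, whereas runs whose final memory contents differ end, with very high probability, with different hashes. Comparing `final_hash` across many randomly scheduled runs of the same input therefore flags external non-determinism without comparing the full memory state.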
SigRace
The idea behind SigRace is to record the addresses accessed by each processor in hardware signatures. At certain times, a signature and its timestamp are automatically passed to an on-chip hardware module called the Race Detection Module. The Race Detection Module keeps the signatures and their timestamps in an in-order queue.
Detecting data races in code that runs in parallel is crucial to effective software development. SigRace is a hardware-assisted approach to data race detection that relies on hardware address signatures. Each processor records the addresses it accesses at run time in the form of signatures. These are then compared against the signatures from other processors to detect data races.
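The signature idea can be sketched in Python as a fixed-width bit vector into which addresses are hashed. The sketch makes several assumptions that do not come from the text: a 256-bit signature, a single hash that uses the word address modulo the signature width, separate read and write signatures per epoch, and a caller that has already established from the timestamps that the two epochs are unordered.

```python
SIGNATURE_BITS = 256  # assumed signature width, for illustration only


def bit_for(address):
    """Map a word-aligned address to one bit of the signature (assumed hash)."""
    return 1 << ((address >> 2) % SIGNATURE_BITS)


class EpochSignature:
    """Read and write signatures accumulated by one processor during one epoch."""

    def __init__(self):
        self.reads = 0
        self.writes = 0

    def record(self, address, is_write):
        if is_write:
            self.writes |= bit_for(address)
        else:
            self.reads |= bit_for(address)


def may_race(a, b):
    """Intersect the signatures of two unordered epochs.

    A non-empty intersection that involves at least one write signature
    signals a possible race; it may be a false positive due to aliasing.
    """
    return bool(a.writes & (b.reads | b.writes) or a.reads & b.writes)
```

Signatures are compact and cheap to intersect, but different addresses can map to the same bit. With the hash assumed here, addresses 0x1000 and 0x1400 alias, so a write to one and a read of the other would be flagged even though no race exists; a real design needs a way to confirm or discard such reports.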
Light64
Light64 is a technique for detecting data races effectively while the program is running. Unlike the previously mentioned technique, it targets program testing, where the runtime overhead must be small and the hardware support lightweight. It also relies on hashing in its implementation. Like the other techniques, it aims at effective bug detection for parallel programs.
The idea behind Light64 is that flipping the order of two racing accesses is highly likely to change the execution of the program. Based on this, Light64 searches for two executions of the program that have the same happens-before graph but differ in some memory access. Such a difference indicates a race. For this to work, the flip must alter the execution enough for the system to notice that a deviation has taken place.
iWatcher
Despite advances in computing, program testing and bug detection have not become any easier; on the contrary, bug detection remains a serious problem for most programmers. The Intelligent Watcher (iWatcher) is a step towards making bug detection easier. It focuses on keeping the overhead of monitoring as low as possible.
iWatcher monitors program execution with minimal overhead and is also flexible. It associates program-specified monitoring functions with memory locations. When any such location is accessed, the monitoring function is triggered automatically with minimal overhead.
To reduce overhead further and to support rollback, iWatcher can leverage Thread-Level Speculation (TLS). One advantage of iWatcher is that it monitors all accesses to the watched locations. Consequently, it can catch bugs that are hard to find, such as updates through aliased pointers and stack-smashing attacks, which are commonly exploited by viruses.
Another advantage is its low overhead: it monitors only the locations that are actually watched, unlike other strategies that also check accesses to locations that do not need monitoring. It is also flexible, because it can support a wide range of checks, including program-specific ones. In addition, iWatcher works across languages and across modules.
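The association between memory locations and monitoring functions can be mimicked in software with the Python sketch below. It only models the behaviour, since in iWatcher the triggering is done by the hardware; the names `WatchedMemory`, `watch`, `load`, and `store` are hypothetical and do not reflect the actual iWatcher interface.

```python
class WatchedMemory:
    def __init__(self):
        self.cells = {}     # address -> current value
        self.watchers = {}  # address -> list of monitoring functions

    def watch(self, address, monitor):
        """Associate a monitoring function with a memory location."""
        self.watchers.setdefault(address, []).append(monitor)

    def store(self, address, value):
        # Only watched locations trigger monitors; other stores pay no extra cost.
        for monitor in self.watchers.get(address, []):
            monitor(address, "store", value)
        self.cells[address] = value

    def load(self, address):
        value = self.cells.get(address)
        for monitor in self.watchers.get(address, []):
            monitor(address, "load", value)
        return value


violations = []


def check_non_negative(address, kind, value):
    """Example of a program-specific check: a watched cell must never go negative."""
    if kind == "store" and value < 0:
        violations.append((address, value))
```

Because the check is attached to the location rather than to a particular statement, a store that reaches the watched cell through any pointer or alias triggers the same monitoring function.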
ReEnact
ReEnact is another strategy used to detect and debug data races in multithreaded programs. It divides execution into epochs and orders them using the program's synchronization operations. A race appears as communication between epochs that have not been ordered.
Two accesses are considered to conflict if both access the same location and at least one of them is a store. Programs use well-defined synchronization operations to order conflicting accesses. If conflicting accesses are not ordered by such operations, a data race occurs. The resulting non-determinism in the execution of the program is considered a bug.
Because ReEnact uses the synchronization operations in the program to order epochs, when there are no data races all sharing of data across threads takes place between epochs that are already ordered. ReEnact also does not commit epochs as long as there is an epoch that is still running.
Tolerating Concurrency Bugs Using Transactions as Lifeguards
This strategy introduces lifeguard transactions (LifeTxes) to guard against untested thread interleavings. A code region that has not been tested is enclosed in a single LifeTx. As testing of thread interleavings for the region progresses, the initial LifeTx is split into smaller LifeTxes so that the tested interleavings are allowed in a production run.
Detecting and surviving data races using complementary schedules
This strategy is embodied in a system called Frost. The main idea behind Frost is to execute multiple replicas of a program under complementary schedules in order to detect bugs caused by data races more effectively. Synchronization algorithms are applied to make the Frost system effective. This matters because data race bugs manifest only intermittently and must be detected efficiently, or they will survive the program testing stage.
Parallelizing Data Race Detection
It is a challenge to parallelize fine-grained analyses such as data race detection. Normally the analysis runs together with the execution of the program. One way to achieve parallel race detection is to scale the race detector along with the application.
Another way to achieve parallelization is to instrument the application so that it logs all memory accesses and then feed the log to multiple threads that perform the race detection. One challenge with this approach is that every memory access must be logged, which hurts performance and increases overhead.
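The log-based approach can be sketched in Python as follows. The sketch assumes a log of `(thread, address, is_write, locks_held)` entries, where `locks_held` is a `frozenset`, shards the log by address so that all accesses to one location reach the same worker, and applies a simple lockset-style check in each worker. All names are hypothetical, and because of Python's global interpreter lock the sketch illustrates the structure of the approach rather than an actual speedup.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def shard_log(log, workers):
    """Split the access log by address so each worker sees whole locations."""
    shards = [[] for _ in range(workers)]
    for entry in log:
        address = entry[1]
        shards[address % workers].append(entry)
    return shards


def check_shard(shard):
    """Flag addresses touched by several threads, with a write, and no common lock."""
    by_address = defaultdict(list)
    for entry in shard:
        by_address[entry[1]].append(entry)
    racy = set()
    for address, entries in by_address.items():
        threads = {e[0] for e in entries}
        has_write = any(e[2] for e in entries)
        common = frozenset.intersection(*(e[3] for e in entries))
        if len(threads) > 1 and has_write and not common:
            racy.add(address)
    return racy


def detect_parallel(log, workers=4):
    """Run the per-shard check on several worker threads and merge the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(check_shard, shard_log(log, workers)))
    return set().union(*results)
```

Sharding by address keeps the workers independent, so no synchronization is needed between them. The cost noted above remains: every access has to be written to the log before any worker can examine it.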
Conclusion
Detecting bugs and removing them from code is a time-consuming part of program testing, especially for programs that run in parallel, where data races are possible. Thread-level speculation offers new ways of doing this that keep debugging fast yet effective. This strategy also works on a deterministic foundation.
References
Lu, S., Park, S., Seo, E., & Zhou, Y. (2010, May 24). Learning from Mistakes: A Comprehensive Study on Real World Concurrency Bug Characteristics. University of Illinois, pp. 2-5.
Lu, S., Tucek, J., Qin, F., & Zhou, Y. (2006, April 12). AVIO: Detecting Atomicity Violations via Access Interleaving Invariants. University of Illinois, pp. 1-9.
Muzahid, A., Suarez, D., & Qi, S. (2009, April 23). SigRace: Signature-Based Data Race Detection. University of Illinois, pp. 1-14.
Nistor, A., Marinov, D., & Torrellas, J. (2009, May 12). Light64: Lightweight Hardware Support for Data Race Detection during Systematic Testing of Parallel Programs. University of Illinois, pp. 1-16.
Nistor, A., Marinov, D., & Torrellas, J. (2009, March 23). InstantCheck: Checking the Determinism of Parallel Programs Using On-the-Fly Incremental Hashing. Illinois Periodicals, pp. 1-10.
Park, S., Lu, S., & Zhou, Y. (2008, May 23). CTrigger: Exposing Atomicity Violation Bugs from Their Hiding Places. University of Illinois, pp. 3-23.
Prvulovic, M., & Torrellas, J. (2009, July 3). ReEnact: Using Thread-Level Speculation Mechanisms to Debug Data Races in Multithreaded Codes. University of Illinois, pp. 3-7.
Veeraraghavan, K., Chen, P., Flinn, J., & Narayanasamy, S. (2009, December 21). Detecting and Surviving Data Races Using Complementary Schedules. University of Michigan, pp. 1-6.
Wester, B., Devecsery, D., Chen, P., Flinn, J., & Narayanasamy, S. (2010, November 30). Parallelizing Data Race Detection. University of Michigan, pp. 2-7.
Yu, J., & Narayanasamy, S. (2010, December 12). Tolerating Concurrency Bugs Using Transactions as Lifeguards. University of Illinois, pp. 3-9.
Zhou, P., Qin, F., Liu, W., Zhou, Y., & Torrellas, J. (2010, March 12). iWatcher: Efficient Architectural Support for Software Debugging. University of Illinois, pp. 1-9.
Zhou, P., Teodorescu, R., & Zhou, Y. (2009, March 12). HARD: Hardware-Assisted Lockset-based Race Detection. University of Illinois, pp. 1-12.