Google ThreadSanitizer: a power tool for hunting down multithreaded data races

微策略中国

When designing large software systems, we usually introduce multithreading to meet performance and responsiveness requirements, but doing so also brings in a class of notoriously thorny multithreading bugs: they behave bizarrely, reproduce with Schrödinger-like probability, and are maddening to fix. Ask any programmer which bugs they dread most and multithreading bugs will make the list, and the data race is the most common kind. This article examines Google ThreadSanitizer, a power tool for hunting down data races, from four angles: what a data race is, an overview of ThreadSanitizer, its basic algorithm, and how to use it. If you work with multithreaded code now, or will in the future, don't miss it.


What is a data race

A data race is a problem you may run into when writing multithreaded programs. A data race occurs when all of the following conditions hold:

  • Two or more threads in the same process concurrently access the same memory location.
  • At least one of the accesses is a write.
  • The threads use no appropriate synchronization mechanism to order their accesses to that memory location.

When all three conditions hold, the order of the accesses to that memory location is not fixed, so the program may behave in unexpected ways. Because of this randomness and unpredictability, a data race may never be triggered during a product's entire test cycle, yet cause inexplicable, strange behavior after release.

For example, a function can be perfectly fine in a single-threaded program yet risk a data race when called from multiple threads:

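The original snippet did not survive this copy; as a stand-in, here is a minimal sketch of the same idea, written in Go (the function and variable names are invented for illustration):

```go
package main

import (
	"fmt"
	"sync"
)

// counter is shared, mutable state.
var counter int

// increment is perfectly fine when called from one thread, but
// counter++ is a read-modify-write sequence: two concurrent callers
// can read the same old value, and one of the updates is lost.
func increment() {
	counter++
}

func main() {
	var wg sync.WaitGroup
	for n := 0; n < 1000; n++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			increment()
		}()
	}
	wg.Wait()
	fmt.Println(counter) // often less than 1000; a race detector flags this
}
```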

So how do we deal with data races? Enter Google ThreadSanitizer.

An overview of Google ThreadSanitizer

ThreadSanitizer is a member of Google's famous sanitizer family (AddressSanitizer, LeakSanitizer, MemorySanitizer, ThreadSanitizer, HWASAN, UBSan).

Main function: detecting data races at runtime. It also offers a few other checks, such as detecting deadlocks and threads that are never joined.

Supported languages: C/C++, Go, Swift. Supported platforms: Linux, macOS, FreeBSD, NetBSD.

The basic algorithm behind Google ThreadSanitizer

At runtime, ThreadSanitizer views the program as a sequence of events, and it cares mainly about two kinds: memory access events and synchronization events. Memory access events are memory reads and writes. Synchronization events are either locking events or happens-before events. The locking events are RdLock (read-lock acquire), RdUnlock (read-lock release), WrLock (write-lock acquire), and WrUnlock (write-lock release); the happens-before events are SIGNAL and WAIT.

Internally, ThreadSanitizer maintains a state machine and updates its state from these events to detect data races. Before describing the algorithm, we need a few basic concepts.

Tid (Thread ID): the unique ID identifying a thread.

ID: the unique ID identifying a memory location; in practice it is simply the memory address.

EventType: one of READ, WRITE, RdLock, RdUnlock, WrLock, WrUnlock, SIGNAL, WAIT.

Event: an event described by the triple EventType, Tid, ID, written EventTypeTid(ID).

Lock: an ID that appears in a locking event; think of it as the address of a lock variable.

Lock Set (LS): a set of Locks.

Segment: an ordered sequence of memory access events within a single thread; note that it contains no synchronization events.

Happens-before arc: for a pair of events X=SIGNALTx(Ax) and Y=WAITTy(Ay), if Ax=Ay, Tx≠Ty, and ThreadSanitizer observes X before Y, then {X, Y} is a happens-before arc.

Happens-before: a partial order on the set of events. Given two events X=TypeXTx(Ax) and Y=TypeYTy(Ay), when ThreadSanitizer observes X before Y and any one of the following three conditions holds, we say X happens-before Y (written X≺Y):

  • Tx = Ty (X and Y are in the same thread; clearly events within the same Segment satisfy the happens-before condition)
  • {X, Y} is a happens-before arc,
  • ∃ E1, E2 : X ≼ E1 ≺ E2 ≼ Y (happens-before is transitive; here X≼E means X=E or X≺E).

The figure below makes happens-before easier to grasp:

(Figure: segments S1-S7 laid out across three threads T1, T2, T3, connected by the arcs {Signal(H1), Wait(H1)} and {Signal(H2), Wait(H2)}.)

  • Segments S1 and S4 are both in thread T1, and ThreadSanitizer observes S1 first, so S1≺S4. Similarly, S2 ≺ Wait(H1) ≺ S5 ≺ Signal(H2) ≺ S6, and S3 ≺ Wait(H2) ≺ S7.
  • The figure contains two happens-before arcs, {Signal(H1), Wait(H1)} and {Signal(H2), Wait(H2)}, so naturally Signal(H1) ≺ Wait(H1) and Signal(H2) ≺ Wait(H2).
  • By transitivity we obtain happens-before relations between further events, for example S1≺S5≺S7 and S2≺S7.

Segment Set: a set of N segments {S1, S2, S3 ... SN} such that ∀i,j: Si≴Sj (where Si≴Sj means Si⊀Sj and Si≠Sj). For the figure above, {S1, S2, S3} is a segment set and so is {S4, S5}, but neither {S1, S4} nor {S1, S5} is.

Concurrent: two memory access events X and Y are concurrent if X⊀Y and Y⊀X (i.e., their order of occurrence is not determined) and the Lock Set of X does not intersect the Lock Set of Y (i.e., no single lock protects both).

Data race: a data race exists if two memory access events are concurrent and at least one of them is a write (WRITE) event.

ThreadSanitizer's state consists of global state and per-ID state. The global state records all synchronization events ThreadSanitizer has observed (lock sets, happens-before arcs). The per-ID state records runtime information for a given memory location; it contains two segment sets: SSWr (the write segment set) and SSRd (the read segment set). For a given ID, SSWr is the segment set of all WRITE events to that ID seen so far. SSRd is the segment set of READ events to that ID seen so far, with the additional requirement that ∀Sr ∈ SSRd, Sw ∈ SSWr : Sr ≴ Sw (no segment in SSRd may happen-before or equal any segment in SSWr; this condition exists because a read that happens-before a write segment Sw cannot possibly be affected by the writes in Sw, whereas the converse does not hold).

Whenever a memory access event (WRITE or READ) to location ID occurs in thread Tid, ThreadSanitizer runs the following logic: it first updates SSWr and SSRd according to the current event, and then inspects SSWr and SSRd to decide whether there is a data race.


The check itself boils down to looking for a pair of concurrent accesses of which at least one is a write.

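The flowchart from the original article is missing from this copy; the check can be sketched roughly as follows. This is a deliberately simplified illustration, not ThreadSanitizer's actual implementation: vector clocks stand in for the happens-before order on segments, and all type and function names are invented.

```go
package main

import "fmt"

// Access is a simplified record of one memory access event: the thread,
// whether it was a write, the locks held at the time, and a vector
// clock standing in for the segment's position in the happens-before order.
type Access struct {
	Tid   int
	Write bool
	Locks map[string]bool
	Clock []int
}

// happensBefore reports whether clock a precedes clock b
// (a <= b componentwise, and a != b).
func happensBefore(a, b []int) bool {
	le, lt := true, false
	for i := range a {
		if a[i] > b[i] {
			le = false
		}
		if a[i] < b[i] {
			lt = true
		}
	}
	return le && lt
}

// sharesLock reports whether some common lock protects both accesses.
func sharesLock(a, b Access) bool {
	for l := range a.Locks {
		if b.Locks[l] {
			return true
		}
	}
	return false
}

// concurrent: neither access is ordered before the other,
// and their lock sets do not intersect.
func concurrent(a, b Access) bool {
	return !happensBefore(a.Clock, b.Clock) &&
		!happensBefore(b.Clock, a.Clock) &&
		!sharesLock(a, b)
}

// raceOn checks all pairs of accesses to one memory location: a data
// race is a concurrent pair of which at least one access is a write.
func raceOn(accesses []Access) bool {
	for i := range accesses {
		for j := i + 1; j < len(accesses); j++ {
			a, b := accesses[i], accesses[j]
			if (a.Write || b.Write) && concurrent(a, b) {
				return true
			}
		}
	}
	return false
}

func main() {
	// Two unordered writes from different threads, no locks held: a race.
	w1 := Access{Tid: 1, Write: true, Locks: map[string]bool{}, Clock: []int{1, 0}}
	w2 := Access{Tid: 2, Write: true, Locks: map[string]bool{}, Clock: []int{0, 1}}
	fmt.Println(raceOn([]Access{w1, w2})) // true

	// The same pair protected by a common lock "mu": no race.
	w1.Locks = map[string]bool{"mu": true}
	w2.Locks = map[string]bool{"mu": true}
	fmt.Println(raceOn([]Access{w1, w2})) // false
}
```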

As you can see, ThreadSanitizer's basic algorithm is quite simple; once the underlying concepts are clear, there is nothing difficult about it. The state machine in the actual implementation adds many refinements, and the hybrid state machine comes in several variants with different performance and accuracy trade-offs. Space does not permit covering them here; interested readers can consult the references.

Using Google ThreadSanitizer

With the basic definitions in place, let's look at how ThreadSanitizer is used in practice.

As an example, consider a program in which the global variable Global is written both in the main thread and in a second thread t. The two writes are concurrent, so there is a data race.

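The code from the original figure is not reproduced here. A rendition of the described program as a Go sketch (the original is a C++ example; Go's race detector happens to be built on the same ThreadSanitizer runtime) might look like:

```go
package main

import "fmt"

// Global is written concurrently by the main goroutine and by a second
// goroutine ("thread t") with no synchronization between the two
// writes: a write-write data race.
var Global int

func runRace() int {
	done := make(chan struct{})
	go func() { // "thread t"
		Global = 1
		close(done)
	}()
	Global = 2 // unsynchronized write from the main goroutine
	<-done
	return Global
}

func main() {
	fmt.Println(runRace()) // 1 or 2, depending on scheduling
}
```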

Now let's run ThreadSanitizer on this code. First, ThreadSanitizer has to be compiled into the program: it has been integrated into clang since version 3.2 and gcc since version 4.8, and is enabled with the compiler flag -fsanitize=thread; adding -g makes the warnings include file names and line numbers.

(Figure: ThreadSanitizer's warning output for this example, showing the call stacks, line numbers, and the name of the variable involved in the data race.)

As you can see, the warning includes the call stacks, line numbers, and variable name involved in the data race, which makes it very easy to locate the offending code.

ThreadSanitizer checks for data races by monitoring events at runtime, so code that never executes is never checked; you therefore need a thorough test suite that covers as many branches as possible. ThreadSanitizer increases a program's memory usage by roughly 5-10x and slows it down by roughly 2-20x. In principle, all of the code should be compiled with -fsanitize=thread; otherwise you may get false positives or false negatives, or incomplete call stacks in the warnings.

That's all for today. If this sparked any thoughts or questions, let us know in the comments!

  • https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/35604.pdf
  • https://github.com/google/sanitizers/wiki/ThreadSanitizerCppManual
  • https://github.com/google/sanitizers/wiki/ThreadSanitizerPopularDataRaces
  • https://clang.llvm.org/docs/ThreadSanitizer.html
  • http://code.google.com/p/data-race-test


Data races in Go(Golang) and how to fix them

Go is known for how easy it is to build concurrent programs in it. But, with all this concurrency, comes the possibility of the dreaded data race – one of the hardest bugs to debug if you’re ever unfortunate enough to encounter it in your code.

In this post, we will go through a sample program that causes a data race, and detect the race condition with the race detector tool. We will then look at some of the methods to get around and solve the race condition, while still keeping the core logic of our code intact.

The Data Race #

Rather than explaining what a data race is, let’s look at a sample piece of code:

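The sample program itself was lost in this copy; reconstructed from the description that follows, it would look something like this:

```go
package main

import "fmt"

// getNumber writes i in a new goroutine but returns i without waiting
// for that goroutine to finish: the read races with the write.
func getNumber() int {
	var i int
	go func() {
		i = 5
	}()
	return i
}

func main() {
	fmt.Println(getNumber()) // 0 or 5, depending on which access wins
}
```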

Here, we can see that the getNumber function is setting the value of i in a separate goroutine. We are also returning i from the function without any knowledge of whether our goroutine has completed or not. So now, there are two operations that are taking place:

  • The value of i is being set to 5
  • The value of i is being returned from the function

Now depending on which of these two operations completes first, the value printed will be either 0 (the default integer value) or 5.

This is why it's called a data race: the value returned from getNumber changes depending on which of operations 1 and 2 finishes first.

👆 Data race with read finishing first

👆 Data race with write finishing first

As you can imagine, it's horrible having to test and use code that acts differently every single time you call it, and this is why data races pose such a huge problem.

Detecting a Data Race #

The code we went through is a highly simplified example of a data race in action. In larger applications, a data race is much harder to detect on your own. Fortunately for us, Go (as of v1.1) has an inbuilt data race detector that we can use to pinpoint potential data race conditions.

Using it is as simple as adding a -race flag to your normal Go command line tools.

For example, let’s try to run the program we just wrote by using the -race flag:

This is what I got as the output:

The first 0 is the printed result (so we now know that operation 2 finished first). The next few lines give us information about the data race that was detected in our code. (The line numbers may not correspond to the sample code above, since the actual code will have imports and package declarations.)

We can see that the information about our data race is divided into three sections:

  • The first section tells us that there was an attempted write inside a goroutine that we created (which is where we assign the value 5 to i )
  • The next section tells us that there was a simultaneous read by the main goroutine, which, in our code, traces through the return statement and the print statement.
  • The third section describes where the goroutine that caused (1) was created.

So, just by adding the -race flag, the go run command has reported exactly what I walked through in the previous section about the data race.

The -race flag can also be added to the go build and go test commands.

It’s so easy to detect a potential race condition in Go, that I can’t think of any reason not to include the -race flag when building your Go application. The benefits far outweigh the costs(if there even are any) and can contribute to a much more robust application.

Fixing Data Races #

Once you finally find that annoying data race, you’ll be glad to know that Go offers many options to fix it. All of these solutions help to ensure that access to the variable in question is blocked if we are writing to it.

Blocking with Waitgroups #

The most straightforward way of solving a data race, is to block read access until the write operation has been completed:
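The original listing is not reproduced here; a sketch consistent with the description, using a sync.WaitGroup, could be:

```go
package main

import (
	"fmt"
	"sync"
)

// getNumber uses a WaitGroup so that the read cannot start until the
// goroutine's write has completed.
func getNumber() int {
	var i int
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		i = 5
	}()
	wg.Wait() // block here until Done is called
	return i
}

func main() {
	fmt.Println(getNumber()) // always 5
}
```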

Blocking with Channels #

This method is similar in principle to the last one, except we use channels instead of waitgroups:
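Again the listing itself is missing; an equivalent sketch with a channel might be:

```go
package main

import "fmt"

// getNumber blocks until the writing goroutine signals completion by
// closing the done channel.
func getNumber() int {
	var i int
	done := make(chan struct{})
	go func() {
		i = 5
		close(done) // signal: the write has completed
	}()
	<-done // the receive unblocks only after close(done)
	return i
}

func main() {
	fmt.Println(getNumber()) // always 5
}
```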

Blocking inside the getNumber function, although simple, would get troublesome if we want to call the function repeatedly. The next method follows a more flexible approach towards blocking.

Returning a Channel #

Instead of using channels to block the function, we could return a channel through which we push our result, once we have it. Unlike the previous two methods, this method does not do any blocking on its own. Instead it leaves the decision of blocking up to the calling code.

Then, you can get the result from the channel in the calling code:
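The two missing listings (the function and its caller) might be reconstructed as:

```go
package main

import "fmt"

// getNumberChan does no blocking itself: it returns a channel that
// will carry the result once the goroutine has produced it.
func getNumberChan() <-chan int {
	c := make(chan int)
	go func() {
		c <- 5 // send the result when it is ready
	}()
	return c
}

func main() {
	// The calling code decides when (and whether) to block:
	i := <-getNumberChan()
	fmt.Println(i) // 5
}
```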

This approach is more flexible because it allows higher level functions to decide their own blocking and concurrency mechanisms, instead of treating the getNumber function as synchronous.

Using a Mutex #

Until now, we had decided that the value of i should only be read after the write operation has finished. Let’s now think about the case, where we don’t care about the order of reads and writes, we only require that they do not occur simultaneously . If this sounds like your use case, then you should consider using a mutex :
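The missing listing might look like the following sketch, where each access takes the mutex but no ordering between them is imposed:

```go
package main

import (
	"fmt"
	"sync"
)

// getNumber no longer enforces an order between the write and the
// read; the mutex only guarantees the two accesses cannot overlap.
func getNumber() int {
	i := 0
	var mu sync.Mutex
	go func() {
		mu.Lock()
		defer mu.Unlock()
		i = 5
	}()
	mu.Lock()
	defer mu.Unlock()
	return i
}

func main() {
	fmt.Println(getNumber()) // still 0 or 5, but never a mid-write read
}
```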

We can then use getNumber just as in the other cases. At first glance, this method may seem useless, since we still do not have any guarantee as to what the value of i will be.

👆 Mutex with write locking first

👆 Mutex with read locking first

The true value of the mutex shows when we have multiple writes intermixed with read operations. Although you will not need mutexes in most cases, since the previous methods work well enough, it helps to know about them for these kinds of situations.

Conclusion #

Any of the above methods will prevent the data race warning from appearing when you run a command with the -race flag. Each method has different trade-offs and complexity, so you’ll have to evaluate the pros and cons based on your use case.

For me, if i’m ever in doubt, waitgroups normally solve the problem with the least amount of hassle.

The core principle behind all the approaches explained in this section is to prevent simultaneous read and write access to the same variable or memory location.

So, as long as you keep that in mind, you’re good to go 👍


Polyspace Static Analysis Notes

What Are Data Races and How to Avoid Them During Software Development

Data races are a common problem in multithreaded programming. Data races occur when multiple tasks or threads access a shared resource without sufficient protections, leading to undefined or unpredictable behavior.

When you author software to simultaneously handle multiple tasks, you may use multithreaded programming, that is, programs with constructs such as multiple entry points, interleaving of threads, and asynchronous interrupts. However, multithreaded programming can be highly complex and introduce subtle defects such as data races and deadlocks. When such a defect occurs, it can take a long time to reproduce the issue and even longer to identify the root cause and fix the defect.

Example of a Data Race

Let us start with the simplest example of a data race. In the following diagram, Task1 and Task2 write values to the shared resources sharedVar1 and sharedVar2. The tasks later read the values of the shared resources through the functions do_sth_with_shared_resources1() and do_sth_with_shared_resources2(). To begin, consider a simple situation with no protection mechanisms established around these operations.

Figure 1. Simultaneous access to shared resources by two tasks without specific protection

You may ask: what value of sharedVar1 does the function do_sth_with_shared_resources1() read? You may expect the value to be 11, since this value was written in Task1 immediately before the function call. However, without any protection mechanism, the value read may be 21, or in some situations even a corrupt, random value. Because of the concurrent execution of Task1 and Task2, the shared resource sharedVar1 may be rewritten by Task2 before being read again in Task1.

In other words, both of these sequences can happen:

Sequence 1:

  • Task1: sharedVar1 = 11;
  • Task1: do_sth_with_shared_resources1();
  • Task2: sharedVar1 = 21;

Sequence 2:

  • Task1: sharedVar1 = 11;
  • Task2: sharedVar1 = 21;
  • Task1: do_sth_with_shared_resources1();

Without imposing protection mechanisms, any code you write in do_sth_with_shared_resources1() cannot rely on a particular sequence occurring and, therefore, on a particular value of sharedVar1. If your code relies on a particular value of sharedVar1, then the data race becomes a bug.

Data races occur when a shared resource is unpredictably accessed by multiple tasks. Data races may not be easy to understand because the execution of instructions does not follow the sequence in which the instructions are written. Also, the result can change in each test run, making a data race difficult to reproduce and fix.

How to Prevent Data Races with Mutual Exclusion Locks (Mutexes)

A common mechanism to avoid data races is to force a mutual exclusion. In the previous example, you can enforce sequence 1 by:

  • Locking a mutex before Task1: sharedVar1 = 11;
  • Unlocking the mutex after Task1: do_sth_with_shared_resources1();

Other tasks, such as Task2, have to wait for the mutex to be unlocked before accessing sharedVar1; however, the placement of mutex locks and unlocks is not as simple as it sounds. Here is a C code example that implements the tasks shown in Figure 1 with the POSIX pthread_ family of functions. The example attempts to protect against data races by using functions such as pthread_mutex_lock and pthread_mutex_unlock to lock and unlock a mutex.

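The C listing itself is not reproduced in this copy. A Go sketch of the same structure (same variable names; sync.Mutex standing in for the pthread mutex; the premature unlock in thread1 is the planted bug discussed below) is:

```go
package main

import (
	"fmt"
	"sync"
)

var (
	mu                     sync.Mutex
	sharedVar1, sharedVar2 int
)

// thread1 unlocks the mutex too early: the comparison below re-reads
// sharedVar2 outside the critical section, racing with thread2's write.
func thread1() {
	mu.Lock()
	sharedVar1 = 11
	sharedVar2 = 12
	tmp := sharedVar2
	mu.Unlock() // BUG: unlocked before we are done with sharedVar2
	if tmp != sharedVar2 {
		fmt.Printf("thread:1, sharedVar2 = %d and tmp = %d differ\n", sharedVar2, tmp)
	}
}

// thread2 keeps the whole write-then-read sequence under the lock.
func thread2() {
	mu.Lock()
	sharedVar1 = 21
	sharedVar2 = 22
	tmp := sharedVar2
	if tmp != sharedVar2 {
		fmt.Printf("thread:2, sharedVar2 = %d and tmp = %d differ\n", sharedVar2, tmp)
	}
	mu.Unlock()
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 200; i++ {
		wg.Add(2)
		go func() { defer wg.Done(); thread1() }()
		go func() { defer wg.Done(); thread2() }()
	}
	wg.Wait()
	fmt.Println("done")
}
```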

The code starts two threads, each with its own temporary variable tmp. The temporary variable reads the value of a shared resource (sharedVar1 or sharedVar2) immediately after the resource is written. The write and subsequent read operations are protected using mutexes. As a result, the values of the temporary variable and the shared resource are expected to be the same. If the values do not agree, the threads print a message such as "thread:1, sharedVar2 = 22 and tmp = 12 differ".

You can look at the code to review the details, or run it in a real environment to see the following results.

Figure 2. Data race seen in program output

You can see that the message for unintended values, "thread:1, sharedVar2 = 22 and tmp = 12 differ", appears several times. Despite the placement of mutexes, the data race continues to occur.

Debugging such a data race in a real application can take several hours because of the non-deterministic nature of the issue. As you can see in Figure 2, the message for unintended values appears only sporadically. Also, once reproduced, the issue can be difficult to fix. It is not sufficient to simply use mutexes: their placement in the code is also critical.

How to Detect and Fix Data Races

A static analysis tool that automatically detects data races and suggests possible fixes can save a lot of debugging effort.

To understand why the data race continues to occur in the above example despite the use of mutexes, we used the data race checkers of a static analysis tool, Polyspace Bug Finder™. This tool can detect the data race that we saw earlier through the program output.

Figure 3. Data race on shared resource sharedVar2 from two tasks


Figure 4. Checking the program flow with Access Graph

In Figure 4, you can see the program control flow that leads to each operation. The circles marked with 't' show the beginning of two different tasks, task_main::thread1() and task_main::thread2(). The subsequent circles show how the control flow goes through the functions thread1_main and thread2_main, and eventually to the write operations. A shield icon on the write operation in the second task indicates that some protection mechanism is used on this operation. The absence of a similar icon in the first task confirms the earlier suggestion that write operations on sharedVar2 are not protected in this task.

From this suggestion, you can check the function thread1_main and see that the mutex in this function is prematurely unlocked before all shared resources are accessed. You can change the placement of the mutex so that it occurs after sharedVar2 is accessed, and fix the data race.

Figure 5. Resolving the issue by changing the timing of unlocking mutex


In the example from the section above, you can spot the data race during a visual inspection, but in real applications of hundreds of files and thousands of lines of code, data races can be difficult to detect because:

  • Problems occur sporadically and can be hard to reproduce
  • Results can differ for each run. Even setting a breakpoint with a debugger can influence the result.
  • Incorrect placement of mutexes may not fix the root cause or may introduce other problems such as deadlocks or double locks

It is important to run a static analysis tool at a regular cadence to identify data races as soon as possible. A static analysis tool creates an abstraction of the concurrency model used in your program, and it can easily detect whether the established protections are sufficient to prevent data races.

Polyspace Bug Finder offers several features to identify concurrency issues such as data races and deadlocks, along with features that ease their review, such as the above textual and graphical representation of conflicting operations. These features help you identify the root cause of a data race more easily.

In the next post, we’ll look at another common concurrency issue known as deadlock.

Written by Yoo Yong-chul and Anirban Gangopadhyay.

Yoo Yong-chul works as an application engineer at MathWorks Korea and is responsible for code verification products.

Anirban Gangopadhyay works as a documentation writer at MathWorks US. He oversees technical documentation of Polyspace ® products.

Original post: Naver blog post

MathWorks Korea 2021.4.17



Alperen Görmez


I am a Ph.D. student at University of Illinois Chicago, doing research on deep learning. Specifically, I am working on the trade-off between the computational cost and the performance of deep learning models.


Avoiding Data Races in Multithreaded Programming


In Spring 2022, one of the courses I took was ECE 566 - Parallel Processing. As part of the course, we were tasked with presenting a topic of our choice. Thomas and I chose to present on data races and data race detection methods. Our slides covered the basics of data races, data race prevention methods, and different methods that can be used to detect data races. Recently, I decided to brush up on these ideas, and I wanted to write a blog post based on our slides.

Let’s start with the concept of multithreading. Multithreading is a technique that allows multiple tasks to be executed concurrently in a single program. The program is divided into multiple threads, and each thread runs independently. Threads share the same memory space and resources as the main process. They can access and modify the same data. If the data that is accessed by multiple threads is not protected sufficiently, hard-to-debug issues may occur. These issues can be difficult to reproduce, find the cause of and fix. Data races are one common type of such bugs.

A data race occurs when:

  • Two or more instructions from different threads access the same memory location concurrently,
  • At least one of these accesses is a write,
  • There is no synchronization that mandates any particular order among these accesses.

When these three conditions hold, the order of access is nondeterministic and the program behaves in an unpredictable way. Consider the following example:
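The example snippet did not survive this copy; a reconstruction matching the description (keeping the names thread_1 and thread_2 used in the text, rendered in Go) is:

```go
package main

import (
	"fmt"
	"sync"
)

// shared is a global variable written by both threads.
var shared int

// The underscore names mirror the article's text; they are not
// idiomatic Go.
func thread_1() { shared = 1 }
func thread_2() { shared = 2 }

func main() {
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); thread_1() }()
	go func() { defer wg.Done(); thread_2() }()
	wg.Wait()
	fmt.Println(shared) // 1 or 2: the final value changes from run to run
}
```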

In this example, thread_1() and thread_2() assign different values to the same global variable. If these two threads run concurrently, the order in which they execute their assignments is non-deterministic, and the result of the program may change in each run. If the program performs a critical operation depending on the value of the shared resource, the data race becomes a bug.

So, how do we prevent data races? We need a protection mechanism. Mutual exclusion locks, or mutexes, serve this purpose. A mutex ensures that only one thread can access a critical section of code at a time. Other threads are prevented from accessing the critical section until the first thread is done with it. It is important to unlock the mutex once the first thread is finished, otherwise other threads will be blocked indefinitely.

Now that we know about data races and how to prevent them, let’s look at the ways of detecting data races. Data race detection tools can be grouped into two categories: static analysis tools and dynamic analysis tools.

Static Analysis Tools

Static analysis tools analyze the source code of a program to find potential data races. They work at compile time, which means they do not need to run the program. Static analysis tools can be very effective at finding data races, but the drawback is that they can be very noisy due to high number of false reports.

Static analysis tools first discover the shared variables in the program. Then, they run a lockset analysis to find the potential places for data races. A lockset is a set of locks that must be held by a thread in order to access a shared variable safely. If two threads access a shared variable and their locksets have no lock in common, there is a potential for a data race. Finally, static analysis tools run a warning reduction step to reduce the number of false reports. This step is typically based on heuristics, such as the fact that data races can only occur in concurrent portions of the code.

Dynamic Analysis Tools

Dynamic analysis tools execute the program and monitor its execution for potential data races. They track the access of shared variables. Since dynamic analysis tools actually run the code, they are typically more accurate than static analysis tools. However, dynamic analysis tools will miss some data races because they only analyze a particular run of the code. Bugs that are not in the path of execution will not be caught.

Dynamic analysis tools commonly use the happens-before relationship, a partial ordering on events in a multithreaded program. DJIT+ is a dynamic analysis tool that uses the happens-before relationship to track the access history of shared variables and detect data races; DataCollider takes a different approach, sampling memory accesses with code and data breakpoints to catch conflicting accesses.

  • V. Kahlon, Y. Yang, S. Sankaranarayanan, and A. Gupta, “Fast and accurate static data-race detection for concurrent programs,” in Computer Aided Verification: 19th International Conference, CAV 2007, Berlin, Germany, July 3-7, 2007. Proceedings 19, pp. 226-239, Springer, 2007.
  • E. Pozniansky and A. Schuster, “MultiRace: efficient on-the-fly data race detection in multithreaded C++ programs,” Concurrency and Computation: Practice and Experience, vol. 19, no. 3, pp. 327-340, 2007.
  • J. Erickson, M. Musuvathi, S. Burckhardt, and K. Olynyk, “Effective Data-Race Detection for the Kernel,” in 9th USENIX Symposium on Operating Systems Design and Implementation (OSDI 10), 2010.
  • J. Erickson, et al., “Dynamic Analyses for Data Race Detection,” Slideshow, Microsoft.
  • Y. Wang, “Race Detection in Parallel Programming,” Computer Science Department, San Jose State University, [Online]
  • Nathan’s Blog, “Finding Races in Firefox with ThreadSanitizer,” Mozilla Blog, Feb. 20, 2015. [Online]. Available: https://blog.mozilla.org/nfroyd/2015/02/20/finding-races-in-firefox-with-threadsanitizer/
  • Y. Yong-chul and A. Gangopadhyay, “What Are Data Races? How to Avoid Them During Software Development,” MathWorks, [Online]. Available: https://www.mathworks.com/products/polyspace/static-analysis-notes/what-data-races-how-avoid-during-software-development.html



Dynamic Data Race Detection in Go Code

Featured image for Dynamic Data Race Detection in Go Code

Uber has extensively adopted Go as a primary programming language for developing microservices. Our Go monorepo consists of about 50 million lines of code and contains approximately 2,100 unique Go services. Go makes concurrency a first-class citizen; prefixing function calls with the go keyword runs the call asynchronously. These asynchronous function calls in Go are called goroutines. Developers hide latency (e.g., IO or RPC calls to other services) by creating goroutines within a single running Go program. Goroutines are considered "lightweight", and the Go runtime context-switches them on the operating-system (OS) threads. Go programmers often use goroutines liberally. Two or more goroutines can communicate data either via message passing (channels) or shared memory. Shared memory happens to be the most commonly used means of data communication in Go.

A data race occurs in Go when two or more goroutines access the same memory location, at least one of them is a write, and there is no ordering between them, as defined by the Go memory model. Outages caused by data races in Go programs are a recurring and painful problem in our microservices. These issues have brought down our critical, customer-facing services for hours in total, causing inconvenience to our customers and impacting our revenue. In this blog, we discuss deploying Go's default dynamic race detector to continuously detect data races in our Go development environment. This deployment has enabled detection of more than 2,000 races, resulting in ~1,000 data races fixed by more than two hundred engineers.

Dynamically Detecting Data Races

Dynamic race detection involves analyzing a program execution by instrumenting the shared memory accesses and synchronization constructs. The execution of unit tests in Go that spawn multiple goroutines is a good starting point for dynamic race detection. Go has a built-in race detector that can be used to instrument the code at compile time and detect the races during their execution. Internally, the Go race detector uses the ThreadSanitizer runtime library which uses a combination of lock-set and happens-before based algorithms to report races.

Important attributes associated with dynamic race detection are as follows: 

  • Dynamic race detection will not report all races in the source code, as it is dependent on the analyzed executions
  • The detected set of races are dependent on the thread interleavings and can vary across multiple runs, even though the input to the program remains unchanged

When to Deploy a Dynamic Data Race Detector? 

We use more than 100,000 Go unit tests in our repository to exercise the code and detect data races. However, we faced a challenging question on when to deploy the race detector. 

Running a dynamic data race detector at pull request (PR) time is fraught with the following problems:

  • The race detection is non-deterministic. Hence a race introduced by the PR may not be exposed and can go undetected. The consequence of this behavior is that a later benign PR may be affected by the dormant race being detected and get incorrectly blocked, thus impacting developer productivity. Further, the presence of pre-existing data races in our 50M code base makes this a non-starter.
  • Dynamic data race detectors have 2-20x execution-time and 5-10x memory overheads, which can result in either violation of our SLAs or increased hardware costs.


Based on these considerations, we decided to deploy the race detector periodically on a snapshot of code, post-facto, which involves the following steps:

(a) Perform dynamic race detection by executing all the unit tests in the repository

(b) Report all outstanding races by filing tasks to the appropriate bug owner

A detected race report contains the following details:

  • The conflicting memory address
  • 2 call chains (a.k.a., calling contexts or stack traces) of the 2 conflicting accesses
  • The memory access types (read or a write) associated with each access

We handled a few hurdles in ensuring that duplicate races are not reported by hashing the reported stack traces, and we applied heuristics to determine the developer most likely responsible for fixing the bug. While we chose this deployment path, CI-time deployment can be pursued if either detected races do not block the build and are used as warnings to inform the developer, or dynamic race detection is refined to make deterministic CI-time detection feasible.

Impact of Our Deployment

We rolled this deployment out in April 2021 and collected data over a period of 6 months. Our approach has helped detect ∼2,000 data races in our monorepo, which sees hundreds of daily commits from hundreds of Go developers. Of the 2,000 reported races, 1,011 were fixed by 210 different engineers. We observed 790 unique patches among these fixes, suggesting roughly that many unique root causes. We also collected statistics on the total outstanding races over the 6+ month period, reported below: 

[Figure: total outstanding data races over the 6+ month period]

In the initial phase (2-3 months) of the rollout, we shepherded the assignees to fix the data races. The drop in the outstanding races is noticeable during this phase. Subsequently, as the shepherding was minimized, we noticed a gradual increase in the total outstanding races. The figure also shows the fluctuations in the outstanding count, which is due to fixes to races, the introduction of new races, enabling and disabling of tests by developers, and the underlying non-determinism of dynamic race detection. After reporting all the pre-existing races, we also observe that the workflow creates about 5 new race reports, on average, every day.


In terms of the overhead of running our offline data race detector, we noticed that the 95th percentile of the running time of all tests without data race detection is 25 minutes, whereas it increases 4-fold to about 100 minutes with data race detection enabled. In a survey taken by tens of engineers roughly 6 months after rolling out the system, 52% of developers found the system useful, 40% were not involved with the system, and 8% did not find it useful. 

Looking Ahead

Our experiences with this deployment suggest the following advancements: 

  • There is a need for building dynamic race detectors that can be deployed during continuous integration (CI). This requires that the challenges due to non-determinism and overheads are effectively addressed by the new detectors. 
  • Until such time, designing algorithms to root cause and identify appropriate owners for detected data races can help in accelerating the repair of data races. 
  • We have identified underlying coding patterns pertaining to data races in Go (discussed in the second part of this blog series), and a subset of these races can potentially be caught by CI time static analysis checks. 
  • The set of detected races is dependent on the input test suite. Being able to run race detection on other kinds of tests (beyond unit tests) such as integration tests, end-to-end tests, blackbox tests, and even production traces can help detect more races. 
  • We also believe that program analysis tooling that fuzzes the schedules on the input test suite can expose thread interleavings that can enhance the set of detected races. 
  • Finally, the current approach is dependent on the availability of multithreaded executions via unit tests and all possible scenarios may not necessarily be incorporated while manually constructing such tests. Automatically generating multithreaded executions containing racy behavior and using the detector to validate the race can serve as an effective debugging tool. 

This is the first of a two-part blog post series on our experiences with data races in Go code. An elaborate version of our experiences will appear in ACM SIGPLAN Programming Languages Design and Implementation (PLDI), 2022. In the second part of the series we discuss our learnings pertaining to race patterns in Go. 

Murali Krishna Ramanathan


Murali Krishna Ramanathan is a Senior Staff Software Engineer and leads multiple code quality initiatives across Uber engineering. He is the architect of Piranha, a refactoring tool to automatically delete code due to stale feature flags. His interests are building tooling to address software development challenges with feature flagging, automated code refactoring and developer workflows, and automated test generation for improving software quality.

Milind Chabbi


Milind Chabbi is a Staff Researcher in the Programming Systems Research team at Uber. He leads research initiatives across Uber in the areas of compiler optimizations, high-performance parallel computing, synchronization techniques, and performance analysis tools to make large, complex computing systems reliable and efficient.

Posted by Murali Krishna Ramanathan, Milind Chabbi

Data Race Detection and Data Race Patterns in Golang


Golang’s Data Race

Uber, an early adopter of the Go language, is also a heavy user of the Go technology stack. Uber's internal Go monorepo has 50M+ lines of Go code and 2,100 standalone services implemented in Go, so the scale of its Go applications is estimated to be among the top 3 in the world.

Uber not only uses Go, but also regularly shares its experiences and lessons learned from using it.

Uber's engineering blog is the vehicle for these high-quality Go articles, which are worth reading again and again for gophers who want to go deeper.

The blog recently posted two articles about concurrent data races in Go: one on Dynamic Data Race Detection in Go Code and the other on Data Race Patterns in Go . Both articles originate from the preprint " A Study of Real-World Data Races in Golang " published by Uber engineers on arXiv.

Here I'd like to chat with you about these two condensed blog posts, and hopefully we'll all get something out of them.

1. Go’s built-in data race detector

We know that concurrent programs are hard to write and even harder to debug. Concurrency is a breeding ground for problems, and even though Go has concurrency built in and provides primitives based on the CSP concurrency model (goroutine, channel, and select), it turns out that in the real world Go programs do not suffer fewer concurrency problems. "No silver bullet", once again!

But the Go core team has been aware of this for a long time, and added a race detector to the Go tools in Go 1.1. By passing -race to go tool commands, the detector can find places in a program where concurrent accesses to the same variable (at least one of which is a write) may raise a concurrency error. The Go standard library was itself a beneficiary of the race detector's introduction: it has helped the standard library find 42 data race problems.
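The kind of bug the detector catches can be sketched in a few lines. Below, `racyCount` does an unsynchronized read-modify-write of a shared counter from many goroutines; built and run with `-race` (e.g. `go run -race main.go`), the detector prints a "WARNING: DATA RACE" report with both conflicting stacks. `fixedCount` shows the mutex fix. The function names are illustrative:

```go
package main

import (
	"fmt"
	"sync"
)

// racyCount increments a shared counter from n goroutines with no
// synchronization: counter++ is a read-modify-write, so updates can be
// lost and -race flags the conflicting accesses.
func racyCount(n int) int {
	var wg sync.WaitGroup
	counter := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			counter++ // unsynchronized read-modify-write: data race
		}()
	}
	wg.Wait()
	return counter // may be anything from 1 to n
}

// fixedCount is the same loop with a mutex guarding the counter.
func fixedCount(n int) int {
	var (
		mu sync.Mutex
		wg sync.WaitGroup
	)
	counter := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			counter++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return counter // always exactly n
}

func main() {
	fmt.Println(fixedCount(1000)) // 1000
}
```

Note that without `-race` the racy version usually "works" most of the time, which is exactly why such bugs survive into production.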

The race detector is based on ThreadSanitizer (TSan), a tool developed by a team at Google (besides ThreadSanitizer, Google has a whole family of sanitizers: AddressSanitizer, LeakSanitizer, MemorySanitizer, etc.). The first version of TSan was released in 2009, and its detection algorithm was borrowed from the older Valgrind-based tools.

After its release, TSan helped the Chromium browser team identify nearly 200 potential concurrency problems. But the first version of TSan had one big problem: it was slow!

Encouraged by these results, the development team decided to rewrite TSan, which led to v2. Compared with v1, v2 brought several major changes:

  • Compile-time code injection (instrumentation).
  • A reimplemented runtime library, integrated into the compilers (LLVM and GCC).
  • Besides data race detection, it can also detect deadlocks, releasing a lock while it is still held, and more.
  • Roughly 20x performance improvement over v1.
  • Support for the Go language.

So how exactly does TSan v2 work? Let's move on.

2. How ThreadSanitizer v2 works

According to the description of the v2 algorithm on the ThreadSanitizer wiki , ThreadSanitizer consists of two parts: the injected code and the runtime library .

1. Injecting code

The first part works with the compiler to inject code into the program during compilation. So what code is injected, and where? As mentioned before, ThreadSanitizer tracks every memory access in the program, so TSan injects code at every memory access, except in the following cases:

  • Memory accesses that cannot race.

For example: read accesses to global constants, or accesses in a function to memory that provably does not escape to the heap.

  • Redundant accesses: e.g., a read that occurs before a write to the same memory location
  • … …

So what code is injected? Here is an example of instrumenting a memory write inside a function foo.

writing a memory operation inside the function foo

We see that the __tsan_write4 function is injected before the 4-byte write to address p, and the entry and exit of function foo are instrumented with __tsan_func_entry and __tsan_func_exit , respectively. Memory reads that require instrumentation get __tsan_read4 , and atomic memory operations are instrumented with the __tsan_atomic family of calls.

2. TSan Runtime Library

Once the code is injected at compile time and a Go program with TSan built in is produced, it is the TSan runtime library that plays the role of data race detector while the program runs. How does TSan detect a data race?

TSan's detection relies on a concept called the Shadow Cell . What is a Shadow Cell? A Shadow Cell is an 8-byte memory cell that represents one read/write event on a memory address; that is, each write or read of a memory block generates a Shadow Cell. As the recorder of a memory access event, the Shadow Cell stores the information related to that event, as follows.

Shadow Cell

We see that each Shadow Cell records the thread ID, the clock time, the position (offset) and length of the access, and whether the access is a write. For every 8 bytes of application memory, TSan maintains a set of N Shadow Cells , as shown below.

Shadow Cell

The value of N can be 2, 4, or 8, and it directly affects both the overhead TSan incurs and the "accuracy" of its data race detection.

3. Detection algorithm

With injected code and Shadow Cells recording memory access events, what logic does TSan use to detect data races? Let's walk through the detection algorithm using the example given by Google engineer Dmitry Vyukov in one of his talks .

Let's take N=8 as an example (i.e., 8 Shadow Cells track accesses to one 8-byte application memory block). Below is the initial state, before any read or write has touched the block.

Detection algorithm

Now, a thread T1 performs a write operation to the first two bytes of that block of memory, and the write operation generates the first Shadow Cell, as shown below.

Shadow Cell

The Pos field describes the starting offset and length of the access within the 8-byte cell; for example, 0:2 here means the access starts at byte 0 and is 2 bytes long. At this point the Shadow Cell window holds only one cell, so no race is possible.

Next, a thread T2 performs a read of the last four bytes of the block, and the read generates a second Shadow Cell, as shown in the figure below.

Shadow Cell

The bytes involved in this read operation do not intersect with the first Shadow Cell, and there is no possibility of a data race.

Next, a thread T3 performs a write operation for the first four bytes of the block of memory, and the write operation generates a third Shadow Cell, as shown in the figure below.

Shadow Cell

We see that threads T1 and T3 access overlapping regions of the memory block, and T1's access is a write, so a data race is possible. TSan's race detection algorithm is essentially a state machine that runs on every memory access: it walks all the cells in the block's Shadow Cell window, compares the newest cell against the existing ones one by one, and emits a warning if a racy pair is found.

In this example, T1's and T3's write regions overlap; if Shadow Cell 1's clock E1 does not happen-before Shadow Cell 3's clock E3, there is a data race. How the happens-before relation is computed can be found in TSan's implementation.

In this example, N=8 Shadow Cells correspond to the 8-byte application memory block. But memory accesses are high-frequency events, so the Shadow Cell window soon fills up. Where does a new Shadow Cell go then? TSan evicts a randomly chosen old Shadow Cell and writes the new one in its place. This confirms what was mentioned earlier: the choice of N affects TSan's detection accuracy to some extent.
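The walk described above can be sketched in a few lines of Go. This is an illustrative model only: the struct fields and the `happensBefore` stub are assumptions for the sketch, not TSan's real data layout (TSan uses vector clocks for the happens-before check):

```go
package main

import "fmt"

// shadowCell is a toy model of one entry in the per-8-byte window.
type shadowCell struct {
	TID      int // accessing thread
	Clock    int // logical timestamp of the access
	Off, Len int // byte range within the 8-byte block
	IsWrite  bool
}

// overlap reports whether two accesses touch a common byte.
func overlap(a, b shadowCell) bool {
	return a.Off < b.Off+b.Len && b.Off < a.Off+a.Len
}

// happensBefore stands in for TSan's vector-clock check; here we simply
// assume no cross-thread synchronization at all.
func happensBefore(a, b shadowCell) bool { return a.TID == b.TID }

// report compares a new access against every cell in the window,
// mirroring the state-machine walk described above.
func report(window []shadowCell, next shadowCell) bool {
	for _, old := range window {
		if old.TID != next.TID && // different threads
			(old.IsWrite || next.IsWrite) && // at least one write
			overlap(old, next) && // byte ranges intersect
			!happensBefore(old, next) { // no ordering between them
			return true // racy pair found
		}
	}
	return false
}

func main() {
	window := []shadowCell{
		{TID: 1, Clock: 1, Off: 0, Len: 2, IsWrite: true},  // T1 writes bytes 0-1
		{TID: 2, Clock: 2, Off: 4, Len: 4, IsWrite: false}, // T2 reads bytes 4-7
	}
	t3 := shadowCell{TID: 3, Clock: 3, Off: 0, Len: 4, IsWrite: true} // T3 writes bytes 0-3
	fmt.Println(report(window, t3)) // true: overlaps T1's earlier write
}
```

Running this reproduces the walkthrough: T3's write overlaps T1's write, there is no happens-before edge between them, so a race is reported; T2's read never conflicts because its byte range is disjoint.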

Now that we have a first understanding of how TSan v2 detects races, let's return to Uber's article and see when Uber deploys race detection.

3. When to deploy a dynamic Go data race detector

As the brief description of TSan's mechanism above suggests, the data race detection enabled by -race has a significant impact on a program's performance and overhead.

The official Go document "Data Race Detector" states that Go programs built with -race have 5-10x the memory overhead and 2-20x the execution time of normally built programs. And the race detector can only catch data races while the program is running. So gophers are cautious about using -race, especially in production environments. The 2013 article "Introducing the Go race detector", co-authored by Dmitry Vyukov and Andrew Gerrand, says outright that keeping the race detector always on in production is impractical. They recommend two uses for it: one is to turn on the race detector while running tests, especially in integration and stress testing scenarios; the other is to enable it in production on just one service instance among many, with how much traffic hits that instance left up to you ^_^.

So, how does Uber do it internally? As mentioned earlier, Uber has a single internal repository containing 50M+ lines of code and 100,000+ unit test cases. Uber ran into two problems with the timing of deploying the race detector:

  • The non-determinism of -race results makes per-PR race detection ineffective.

For example: a PR that introduces a data race may pass because the race goes undetected in that run, while a later PR with no data race may trip over a pre-existing race when detection runs, blocking that PR's merge and hurting the productivity of the developers involved.

At the same time, it is impractical to flush out all the data races lurking in the existing 50M+ lines of code.

  • The overhead of the race detector affects the SLA (as I understand it, Uber's internal CI pipeline has a time SLA, a promise to developers, and running race detection on every PR might break it) and drives up hardware costs.

Uber's deployment strategy for these two problems is to "test after the fact": every so often, take a snapshot of the code repository and run all the unit tests with -race on. That doesn't sound like anything new; many companies probably do it this way.

When a data race is found, a report is sent to the appropriate developer. Here Uber's engineers did some work to identify, from the information in the race detection results, the author most likely to have introduced the bug, and to send the report to that person.

One data point is worth noting, though: without race detection, the p95 time to run all of Uber's unit tests is 25 minutes; with race detection enabled, that increases 4x, to about 100 minutes.

Uber's engineers rolled out this scheme in mid-2021. Along the way they identified the main code patterns that generate data races, and they may later build static analysis tools targeting these patterns to help developers catch data race issues earlier and more effectively. Next, let's look at those code patterns.

4. What are the common data race patterns

Uber's engineers summarized seven kinds of data race patterns ; let's look at them one by one.

1. Closures

The Go language natively supports closures. In Go, a closure is a function literal . A closure can refer to variables defined in its enclosing function; those variables are then shared between the enclosing function and the function literal, and they live as long as they remain reachable.

But you may not realize that Go closures capture variables from their enclosing function by reference. Unlike C++, where you can choose to capture by value or by reference, Go gives you no choice. Capture by reference means that once the closure runs in a new goroutine, there is a high probability of a data race between the two goroutines on the captured variables. The "unfortunate" thing is that in Go, closures are routinely used as a goroutine's execution function.

The Uber article gives three examples of data race patterns that result from this indiscriminate capture-by-reference.

Example 1

In this first example, each loop iteration creates a new goroutine from a closure. Each of these goroutines captures the enclosing loop variable job, which sets up a race on job between the goroutines.
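The standard fix, sketched below, is to pass the loop variable to the goroutine as an argument so each goroutine gets its own copy. (Before Go 1.22 introduced per-iteration loop variables, `go func() { use(job) }()` made every goroutine share one `job` variable, racing on it.) The function and job type here are hypothetical, for illustration:

```go
package main

import (
	"fmt"
	"sync"
)

// processAll doubles every job in a goroutine per job. Passing job as an
// argument gives each goroutine its own copy, avoiding the shared
// loop-variable race; the mutex protects the shared results slice.
func processAll(jobs []int) []int {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		results []int
	)
	for _, job := range jobs {
		wg.Add(1)
		go func(job int) { // pass by value: each goroutine owns its copy
			defer wg.Done()
			mu.Lock()
			results = append(results, job*2)
			mu.Unlock()
		}(job)
	}
	wg.Wait()
	return results
}

func main() {
	fmt.Println(len(processAll([]int{1, 2, 3, 4}))) // 4
}
```

Note that the results arrive in scheduling order, not input order, which is why the slice itself also needs the mutex.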

Example 2

In Example 2, the combination of closure capture and variable scoping means the new goroutine's closure assigns to the err variable that is also the return value of the enclosing function Foo. This makes err the object of a race between the two goroutines.

Example 3

In Example 3, the named return value result is captured by the closure run as a new goroutine, producing a data race between the two goroutines on result.
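A minimal sketch of the named-return pattern, with hypothetical names: the goroutine writes the named result `result` from another goroutine. Here the WaitGroup orders the write before the return, making the code correct; removing `wg.Wait()` reintroduces exactly the race described above, because `return` would read `result` concurrently with the assignment:

```go
package main

import (
	"fmt"
	"sync"
)

// compute assigns its named return value from a spawned goroutine.
func compute() (result int) {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		result = 42 // writes the named return value from another goroutine
	}()
	wg.Wait() // without this wait, `return` races with the assignment above
	return
}

func main() {
	fmt.Println(compute()) // 42
}
```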

2. Slices

Slices are Go's built-in composite data type. Compared with traditional arrays, slices can grow dynamically and are passed around as cheap, fixed-size "slice descriptors", which makes them very widely used in Go. Flexible as they are, though, slices are also one of Go's trickiest data types, and careless use invites mistakes.

Here is an example of forming a data race on a sliced variable.

an example of forming a data race on a sliced variable

From this code it appears that although the developer did protect the captured slice variable myResults with a mutex, the slice was passed without the mutex held when the new goroutine was created later. The example code itself looks slightly off, though: the myResults argument passed in does not appear to be used further.

3. Maps

map is the other most commonly used built-in composite data type in Go, probably second only to slices in the problems it causes for Go beginners. A Go map is not goroutine-safe, and Go forbids concurrent reads and writes of a map variable. Yet because it is the built-in hash table type, map is used pervasively in Go programs.

a concurrent read/write map example

The example above reads and writes a map concurrently. Unlike slices, Go's map implementation has built-in detection of concurrent reads and writes: even without -race, the runtime aborts with a fatal error as soon as a concurrent access is found.
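A minimal sketch of making concurrent map writes safe with a mutex (names are illustrative). Without `mu`, the Go runtime detects the concurrent writes itself and aborts with `fatal error: concurrent map writes`, even when `-race` is off:

```go
package main

import (
	"fmt"
	"sync"
)

// fillMap writes n entries into a shared map from n goroutines,
// serializing the writes with a mutex.
func fillMap(n int) map[int]int {
	var (
		mu sync.Mutex
		wg sync.WaitGroup
	)
	m := make(map[int]int)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			mu.Lock() // serialize writers; dropping this reintroduces the race
			m[i] = i * i
			mu.Unlock()
		}(i)
	}
	wg.Wait()
	return m
}

func main() {
	fmt.Println(len(fillMap(100))) // 100
}
```

For read-mostly maps, sync.RWMutex or sync.Map are common alternatives to a plain mutex.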

4. Mistakenly passed values cause trouble

Go encourages pass-by-value semantics because it simplifies escape analysis and gives variables a better chance of being allocated on the stack, reducing GC pressure. However, some types must not be passed by value, such as sync.Mutex in the example below.

sync.Mutex

Mutex is usable at its zero value: no initialization is required before using a Mutex instance. But the Mutex type has internal state.

sync.Mutex

Passing it by value copies that state, defeating its purpose of synchronizing data access across goroutines, as with the Mutex variable m in the example above.
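A sketch of the correct shape, with illustrative names: the mutex is shared by pointer, so every goroutine locks the same lock. Declaring `inc` as `func(m sync.Mutex)` instead would copy the lock on every call (`go vet` reports "passes lock by value"), each goroutine would lock its own private copy, and `counter++` would race:

```go
package main

import (
	"fmt"
	"sync"
)

// safeCount increments a counter from n goroutines, sharing one mutex
// by pointer so the lock actually protects the counter.
func safeCount(n int) int {
	var (
		m  sync.Mutex
		wg sync.WaitGroup
	)
	counter := 0
	inc := func(m *sync.Mutex) { // *sync.Mutex: one shared lock for all callers
		m.Lock()
		counter++
		m.Unlock()
	}
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			inc(&m)
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	fmt.Println(safeCount(1000)) // 1000
}
```

Running `go vet` as part of CI catches the by-value variant mechanically, before the race ever runs.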

5. Misuse of messaging (channel) and shared memory

Go uses the CSP concurrency model, with the channel type as the communication mechanism between goroutines. Although CSP is arguably higher-level than shared memory, in practice it is easy to make mistakes with channels without a solid understanding of the model.

Misuse of messaging (channel) and shared memory

The problem in this example is that the goroutine started by the Start function may block forever on the f.ch send: once ctx is canceled, Wait exits, and no goroutine is left receiving from f.ch, so the goroutine started by Start blocks on the line "f.ch <- 1".

As you can see, problems like this are very subtle and difficult to identify with the naked eye without careful analysis.

6. sync.WaitGroup misuse causes data race problem

sync.WaitGroup is the mechanism Go concurrent programs commonly use to wait for a group of goroutines to exit. Its internal counter is adjusted through the Add and Done methods, and the Wait method blocks until the counter reaches 0. Misusing WaitGroup as in the example below, however, leads to data race problems.

sync.WaitGroup misuse causes data race problem

We see that the example calls wg.Add(1) inside the goroutine's function, instead of before the goroutine is created and started, which is the correct placement. As a result, depending on goroutine scheduling, Wait may run before the goroutines get a chance to call Add(1), so Wait returns early and races with the goroutines' work.
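The correct placement can be sketched as follows (names are illustrative): `Add(1)` runs before `go`, so the counter is raised before `Wait` can observe it. Moving `wg.Add(1)` inside the goroutine body reproduces the misuse described above, where `Wait` may return before the goroutines have even started:

```go
package main

import (
	"fmt"
	"sync"
)

// sum adds 1..n using one goroutine per term, with correct WaitGroup use.
func sum(n int) int {
	var (
		mu    sync.Mutex
		wg    sync.WaitGroup
		total int
	)
	for i := 1; i <= n; i++ {
		wg.Add(1) // must happen before Wait can run; never inside the goroutine
		go func(i int) {
			defer wg.Done()
			mu.Lock()
			total += i
			mu.Unlock()
		}(i)
	}
	wg.Wait() // guaranteed to see all n Add calls
	return total
}

func main() {
	fmt.Println(sum(100)) // 5050
}
```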

The following example shows a data race between two goroutines on the locationErr variable, caused by the execution order of the deferred function as the function returns.

a data race between two goroutines

When the main goroutine checks whether locationErr is nil, doCleanup in the other goroutine may or may not have executed yet.

7. Parallel table-driven tests may trigger a data race

Go has a built-in unit testing framework that supports parallel tests (testing.T.Parallel()). Parallel table-driven tests, however, make it remarkably easy to introduce data races. The original article gives no example, so try it for yourself.

Before Uber released these two articles, other materials had already cataloged data race code patterns; the following two resources are worth a look.

  • "Data Race Detector" - https://go.dev/doc/articles/race_detector
  • "ThreadSanitizer Popular Data Races" - https://github.com/google/sanitizers/wiki/ThreadSanitizerPopularDataRaces

In the just-released Go 1.19 beta 1, -race has been upgraded to TSan v3: race detection performance improves by 1.5x-2x over the previous version, memory overhead is halved, and there is no longer an upper limit on the number of goroutines.

Note: to use -race in Go, cgo must be enabled.
