c - How can I make code that concurrently reads and modifies an array well-defined without introducing locking? -


i'm writing program computes endgame tablebase chess variant. algorithm filling tablebase works this:

  1. start huge array of unsigned char, each member representing 1 position (we assume it's white's turn). array member if position lost, odd if won, 0xff if invalid, 0xfe if draw.
  2. iterate on array, marking each illegal position 0xff, each position white mated 0x00 , other positions 0x0fe.
  3. iterate on array, considering positions marked 0xfe. check if there move leads position array member even, if there one, write 1 plus number of position member corresponding curren position. if moves lead positions indicated odd numbers (i.e. loosing position), mark position 1 plus highest of these numbers indicate how long strongest defence takes.
  4. repeat step 3 until array doesn't change anymore.

for speed, i'd parallelize step three. careful reading yields in each iteration, ever write 1 value (the number of iteration) array. following strategy obtains:

  • split array n parts, let 1 thread work on each part. let current iteration i.
  • if thread has read member array , equal i, treat if set 0xfe because means member written concurrently thread.

now there race condition in program doesn't matter right result if there aren't pink elephants (which can't exist if char written atomically). however, since on paper there race condition, c compiler may declare program undefined , format hard disk.

what can parallelize algorithm without violating constraints of c memory model , without causing massive slowdown (e.g. adding locks)?

simplified problem description

here simplified algorithm demonstrates same concept stripped of unimportant stuff:

  1. start out array unsigned char a[n]. each array member either 0 or 1.
  2. for each array member set 0: if other array members equal 0 or 2, set array member 2. array members checked depend on index of array member want update. there no simple relationship between index , array members need check, it's random.

since ever change 0 2, doesn't matter in order process array entries, though there technically race condition if in parallel (since concurrently read/write same object). how can tell compiler shouldn't care race condition without sacrificing performance?

this _atomic type qualifier in c11. declare array as

_atomic unsigned char a[n]; 

which means each element of array can read or written atomically.

prior c11, there's no standard way this, generally, depending on implementation, datatypes atomic reads , writes. know are, you'll have @ documentation implementation using.


note default memory ordering c11 _atomic accesses memory_order_seq_cst (sequential consistency), , if don't need that, can use atomic_load_explicit , atomic_store_explicit actions weaker memory ordering (ie memory_order_relaxed in example)


Comments

Popular posts from this blog

jOOQ update returning clause with Oracle -

java - Warning equals/hashCode on @Data annotation lombok with inheritance -

java - BasicPathUsageException: Cannot join to attribute of basic type -