C# - LINQ: group a collection by an arbitrary lattice
Apologies if I'm missing something basic.
Given a lattice array, where the lattice values represent the minimum of each bucket, what is the best way to group an array of values?
E.g.:
double[] lattice = { 2.3, 2.8, 4.1, 4.7 };
double[] values = { 2.35, 2.4, 2.6, 3, 3.8, 4.5, 5.0, 8.1 };
GroupByLattice(values, lattice);
Such that GroupByLattice returns IGroupings like:
2.3 : { 2.35, 2.4, 2.6 }
2.8 : { 3, 3.8 }
4.1 : { 4.5 }
4.7 : { 5.0, 8.1 }
Edit:
I'm green enough with LINQ queries that the best I can come up with is:
values.GroupBy(curr => lattice.First(lat => curr > lat))
The issues with this:
- Everything ends up in the first bucket. I can understand why (of course the first bucket satisfies the predicate for every value after it), but I'm having a hard time wrapping my head around these in-place operations to get the predicate I want.
- I suspect that having a LINQ query inside of a LINQ query is not going to be performant.
Post-mortem solution and results:
Dmitry Bychenko provided a great answer; I just wanted to provide a follow-up for those who may come across this answer in the future. The problem I had been trying to solve was: how can I simplify a huge dataset for plotting?
For starters, my first attempt was pretty close. With the lattice being ordered, I simply needed to change .First( ... ) to .Last( ... ), i.e.
values.GroupBy(curr => lattice.Last(lat => curr > lat))
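To see why the ordering matters, here is a minimal sketch that exposes what the predicate matches for a single value. The class and method names (FirstVersusLast, Candidates, KeyWithFirst, KeyWithLast) are made up for illustration and are not from the original post.

```csharp
using System;
using System.Linq;

public static class FirstVersusLast
{
    // Every lattice point that satisfies the predicate `value > lat`, in lattice order.
    public static double[] Candidates(double value, double[] lattice) =>
        lattice.Where(lat => value > lat).ToArray();

    // Key picked by the original attempt: the first (smallest) matching point.
    public static double KeyWithFirst(double value, double[] lattice) =>
        lattice.First(lat => value > lat);

    // Key picked by the corrected attempt: the last (largest) matching point.
    public static double KeyWithLast(double value, double[] lattice) =>
        lattice.Last(lat => value > lat);
}
```

For the value 4.5 and the lattice { 2.3, 2.8, 4.1, 4.7 }, the candidates are 2.3, 2.8 and 4.1, so First picks 2.3 while Last picks 4.1. Two things to keep in mind: both First and Last throw an InvalidOperationException when no lattice point matches (a value at or below the lowest lattice point), and because the comparison is a strict `>`, a value exactly equal to a lattice point falls into the bucket below it.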
That's all well and good, but I was curious how much better Dmitry's solution would perform. I tested it with a random set of 10000 doubles, with the lattice at 0.25 spacing. (I pulled out the .Select(...) transform from Dmitry's solution to keep it fair.)
The average of 20 runs spit out this result:
Mine: 602ms
Dmitry's: 3ms
Uh ... wow! That's a 200x increase in speed. 200x! I had to run this a few times and inspect it in the debugger to be sure the LINQ statement was evaluating before the timestamp (trusty .ToArray() to the rescue). I'm going to say it now: anyone who's looking to accomplish this same task should use this methodology.
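For anyone who wants to repeat the comparison, a rough timing harness might look like the sketch below. The 10000 values, 0.25 spacing and 20 runs come from the description above; the value range (roughly 0 to 100), the random seed and all names (LatticeBenchmark, GroupLinear, GroupBinary, AverageTime, Run) are assumptions of mine, and this is not the code that produced the numbers quoted above.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class LatticeBenchmark
{
    // Linear scan per value, as in the corrected first attempt.
    // ToArray() forces evaluation so the work happens inside the timed region.
    public static IGrouping<double, double>[] GroupLinear(double[] values, double[] lattice) =>
        values.GroupBy(v => lattice.Last(lat => v > lat)).ToArray();

    // Binary search per value, as in the answer (requires a sorted lattice).
    public static IGrouping<double, double>[] GroupBinary(double[] values, double[] lattice) =>
        values.GroupBy(v =>
        {
            int index = Array.BinarySearch(lattice, v);
            return index >= 0 ? lattice[index] : lattice[~index - 1];
        }).ToArray();

    // Average wall-clock time of `action` over `runs` runs, after one warm-up call.
    public static TimeSpan AverageTime(Action action, int runs)
    {
        action(); // warm-up so JIT compilation is not part of the measurement
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < runs; i++) action();
        sw.Stop();
        return TimeSpan.FromTicks(sw.Elapsed.Ticks / runs);
    }

    public static void Run()
    {
        var rng = new Random(42); // fixed seed (assumption) for repeatability
        // Lattice 0, 0.25, ..., 100 (assumed range); already sorted.
        double[] lattice = Enumerable.Range(0, 401).Select(i => i * 0.25).ToArray();
        // Values start at 0.1 so every value is strictly above the lowest lattice point.
        double[] values = Enumerable.Range(0, 10000)
            .Select(_ => 0.1 + rng.NextDouble() * 99.0).ToArray();

        TimeSpan linear = AverageTime(() => GroupLinear(values, lattice), 20);
        TimeSpan binary = AverageTime(() => GroupBinary(values, lattice), 20);
        Console.WriteLine($"linear: {linear.TotalMilliseconds:F2} ms, binary: {binary.TotalMilliseconds:F2} ms");
    }
}
```

Calling LatticeBenchmark.Run() prints both averages. The absolute numbers will depend on the machine, the runtime and the lattice size, so treat them as relative only.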
Provided that the lattice is sorted (it's easy to sort the array with Array.Sort(lattice)), you can use Array.BinarySearch:
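double[] lattice = { 2.3, 2.8, 4.1, 4.7 };
double[] values = { 2.35, 2.4, 2.6, 3, 3.8, 4.5, 5.0, 8.1 };

var result = values
  .GroupBy(item => {
    // BinarySearch returns the index on an exact match; otherwise the bitwise
    // complement of the index of the first element larger than item.
    int index = Array.BinarySearch(lattice, item);
    // Note: a value below lattice[0] gives ~index - 1 == -1, which throws.
    return index >= 0 ? lattice[index] : lattice[~index - 1];
  })
  .Select(chunk => string.Format("{0} : [{1}]", chunk.Key, string.Join(", ", chunk)));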
Test:
Console.Write(string.Join(Environment.NewLine, result));
Outcome:
2.3 : [2.35, 2.4, 2.6]
2.8 : [3, 3.8]
4.1 : [4.5]
4.7 : [5, 8.1]
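If you want the GroupByLattice call shape from the question, the binary search approach could be wrapped up roughly as follows. This is only a sketch: the extension class name is made up, sorting a private copy of the lattice is my choice, and putting values below the lowest lattice point under a double.NegativeInfinity key is an assumption of mine (the question does not say what should happen to them).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LatticeExtensions
{
    // Groups each value under the largest lattice point that is <= the value.
    // Assumption (not from the original post): values below the lowest lattice
    // point are grouped under double.NegativeInfinity instead of throwing.
    public static IEnumerable<IGrouping<double, double>> GroupByLattice(
        this IEnumerable<double> values, IEnumerable<double> lattice)
    {
        // Sort a private copy so the caller's collection is left untouched.
        double[] sorted = lattice.OrderBy(x => x).ToArray();

        return values.GroupBy(item =>
        {
            int index = Array.BinarySearch(sorted, item);
            if (index >= 0) return sorted[index]; // exact hit on a lattice point

            // ~index is the position of the first lattice point greater than item,
            // so the bucket minimum sits one position before it.
            int below = ~index - 1;
            return below >= 0 ? sorted[below] : double.NegativeInfinity;
        });
    }
}
```

Usage would then be values.GroupByLattice(lattice), with groups appearing in the order their keys are first encountered in values. Unlike the `curr > lat` version, a value exactly equal to a lattice point lands in that point's own bucket.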