excel - On making MATCH function like FIND function -


i'm trying make match function work find function. first of all, generate dummy data use testing. here routine use:

sub data_generator() randomize dim data(1 100000, 1 1) dim p single = 1 100000     p = rnd()     if p < 0.4         data(i, 1) = "a"     elseif p >= 0.4 , p <= 0.7         data(i, 1) = "b"     else         data(i, 1) = "c"     end if next range("a1:a100000") = data end sub 

now, create sub-routine find string a in range data. there 2 methods use here employ match function. first method reset range of lookup array following code:

sub find_match_1() t0 = timer dim long, j long, k long, data range dim output(1 100000, 1 1) on error goto finish     set data = range(cells(j + 1, 1), "a100000")   'reset range of lookup array     = worksheetfunction.match("a", data, 0)     j = j +     output(j, 1) = j                               'label position of     k = k + 1                                      'counting number of [a] found loop finish:     range("b1:b100000") = output     inputbox "the number of [a] found " & k & " in", "process complete", timer - t0 end sub 

and second method, assign cell of range a located value vbnullstring instead of resetting range("a1:a100000"). idea delete string a after being found , expect match function find next string a in range("a1:a100000"). here code implement second method:

sub find_match_2() t0 = timer dim n long, long, j long dim data_store() dim output(1 100000, 1 1) data_store = range("a1:a100000") on error goto finish     j = worksheetfunction.match("a", range("a1:a100000"), 0)     output(j, 1) = j     cells(j, 1) = vbnullstring     n = n + 1 loop finish: range("a1:a100000") = data_store range("b1:b100000") = output inputbox "the number of [a] found  " & n & " in", "process complete", timer - t0 end sub 

the goal determine method better @ employing match function in performance. turns out first method completes less 0.4 seconds meanwhile second method completes minute on pc. questions are:

  1. why second method take time long complete?
  2. how 1 improve performance of second method?
  3. can match function used in array?

i agree more of code review question, chose own curiosity, i'll share found.

i think you're hitting classic case of n vs n^2 computational complexity. @ 2 methods, seem remarkably similar, , consider they're doing, keeping in mind match function linear search when use match_type=0 (because data unsorted, whereas other match types binary search on sorted data).

method 1:

  • start @ a1
  • continue down range until "a" found
  • restart @ cell below match

method 2:

  • start @ a1
  • continue down range until "a" found
  • clear "a"
  • restart @ a1

it should instantly apparent while 1 method continually shrinking range searches, other starting @ first cell , searching whole range. account of speedup, , boosts method 1 nice lead, it's not full story.

the real key lies in amount of work match has each situation. because range shrinks , moves start further down list, whichever cell method 1's match starts from, has search small number of cells before hits , resumes outer loop. meanwhile, method 2 continually destroying a's, making them less , less dense , forcing search more , more of range before getting hits. end, method 2 looping through 100,000 empty cells/b's/c's before finding next a.

so on average, match method 1 looking through couple of cells each time, while match method 2 taking longer , longer time goes on, until end when forced loop through entire range. on top of that, method 2 doing bunch of writes cell values, slower might think when have tens of thousands of times.

in honesty, best bet loop through cells once, looking a's , handling them go. match brings no advantage table, , method 1 more complicated version of loop described.

i'd write like:

sub find_match_3()     t0 = timer     dim k long, r range     dim output(1 100000, 1 1)     each r in range("a1:a100000").cells         if r.value = "a"             output(r.row, 1) = r.row        'label position of             k = k + 1                       'counting number of [a] found         end if     next      range("b1:b100000") = output     inputbox "the number of [a] found " & k & " in", "process complete", timer - t0 end sub 

which 30% faster on machine.


Comments

Popular posts from this blog

jOOQ update returning clause with Oracle -

java - Warning equals/hashCode on @Data annotation lombok with inheritance -

java - BasicPathUsageException: Cannot join to attribute of basic type -