html - How to find and replace 6-digit numbers within HREF links from a map of values across site files, ideally using SED/Python


I need to create a bash script, ideally using sed, to find and replace lists of values in href URL link constructs in HTML site files, by looking them up in a map (old to new values). The links have a given URL construct. There are around 25k site files to go through, and the map has around 6,000 entries that I have to search through.

All old and new values have 6 digits.

The URL construct is:

One value:
href=".*jsp\?.*n=[0-9]{1,}.*"

List of values:
href=".*\.jsp\?.*n=[0-9]{1,}+n=[0-9]{1,}+n=[0-9]{1,}...*"

The list of values is delimited by the + (plus) symbol, and the list can be from 1 to n values in length.

I want to ignore a construct such as this:

href=".*\.jsp\?.*n=0.*"

i.e. where the list is just n=0.

Effectively, I'm only interested in URLs that include 1 or more values that are in the map file and are not already prepended with the changed flag -- i.e. the list requires updating.

Please note: in the above construct examples, .* means any character that isn't a digit. I'm only interested in the 6-digit values in the list of values after n=. I've been trying to isolate the n= list from the rest of the URL construct, and it should be noted that the n= list can appear anywhere within the URL construct.
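Isolating the n= list is the step everything else depends on, so here is a minimal Python sketch of it. The function names are hypothetical, and it assumes the list starts right after `?`, `&` or `&amp;` so that parameters such as ntx= and ntk= are never mistaken for it.

```python
import re

# Assumption: "n=" follows "?", "&" or "&amp;" (hence the ";"), and the items
# of the list are runs of digits separated by "+".
N_LIST = re.compile(r'(?<=[?&;])n=([0-9]+(?:\+[0-9]+)*)', re.IGNORECASE)

def extract_n_values(url):
    """Return the items of the n= list in `url`, or [] if there is none."""
    match = N_LIST.search(url)
    return match.group(1).split('+') if match else []

def six_digit_values(url):
    """Return only the 6-digit items, so a bare n=0 list yields nothing."""
    return [value for value in extract_n_values(url) if len(value) == 6]
```

Because the pattern anchors on the character before n=, it does not matter where in the query string the list appears.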

Initially, I want to create a script that will create a report of all the links that fulfil the above criteria and have a 6-digit old value that's in the map file, along with the file path, so that I have an understanding of the links impacted. E.g.:

 filename    link
 filea.jsp   /jsp/search/results.jsp?n=204200+731&ntx=mode+matchallpartial&ntk=gensearch&ntt=
 filea.jsp   /jsp/search/browse.jsp?ntx=mode+matchallpartial&n=213890+217867+731&
 fileb.jsp   /jsp/search/results.jsp?n=0+450+207827+213767&ntx=mode+matchallpartial&ntk=gensearch&ntt=
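A report like the one above could be produced along these lines. This is only a sketch: `links_needing_update` and `build_report` are hypothetical names, `old_values` is assumed to be a set of the old 6-digit strings from the map file, and the files are assumed to be readable as UTF-8 text.

```python
import os
import re

# Assumption: hrefs are double-quoted and point at a .jsp page with a query string.
HREF = re.compile(r'href="([^"]*\.jsp\?[^"]*)"', re.IGNORECASE)
N_LIST = re.compile(r'(?<=[?&;])n=([0-9]+(?:\+[0-9]+)*)', re.IGNORECASE)

def links_needing_update(text, old_values):
    """Yield each href in `text` whose n= list holds a value in `old_values`."""
    for href in HREF.findall(text):
        match = N_LIST.search(href)
        if match and any(v in old_values for v in match.group(1).split('+')):
            yield href

def build_report(root, old_values):
    """Walk `root` and return sorted (file path, link) rows for the report."""
    rows = []
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            with open(path, encoding='utf-8', errors='replace') as handle:
                text = handle.read()
            rows.extend((path, link) for link in links_needing_update(text, old_values))
    return sorted(rows)
```

Keeping the old values in a set makes each lookup constant time, which matters with 25k files and 6k map entries. A link whose list is only n=0 is skipped automatically because 0 is never in the set.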

Lastly, I'd like to find and replace the 6-digit numbers within the URL construct lists, as outlined above, as efficiently as possible (I'd like it to be reasonably fast, as there are around 25k files, 6k values to look up, and potentially multiple values in each list).

**Please note:** there is an additional issue I have. When finding and replacing, an old value may have been assigned a new value that is itself used as an old value in the map, so it may get replaced a second time.

E.g. if the map file is as below:

 map-file.txt
 old     new
 214865  218494
 214866  217854
 214867  214868
 214868  218633
 ...     ...

And there is an href link such as:

/jsp/search/results.jsp?ntx=mode+matchallpartial&ntk=gensearch&n=0+450+214867+214868

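214867 changes to 214868, which needs to be prepended with a flag to say that the value has been changed and should not be replaced again; otherwise 214867 would end up as 218633, because 214868 is in turn changed to 218633. I hope that makes sense. I would then need to run through the files and remove the flag from the 6-digit numbers that had been marked. With the flags still in place, the link would become: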

/jsp/search/results.jsp?ntx=mode+matchallpartial&ntk=gensearch&n=0+450+214868changed+218633changed

Unless there's a better way to manage these in-file changes.
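One possible way to avoid the flag altogether is to replace every value in a single pass, looking each original value up in the map exactly once, so a freshly written 214868 is never looked at again. The sketch below assumes double-quoted hrefs and a `mapping` dict of old to new strings; `replace_in_hrefs` is a hypothetical name.

```python
import re

HREF = re.compile(r'(href=")([^"]*\.jsp\?[^"]*)(")', re.IGNORECASE)
N_LIST = re.compile(r'(?<=[?&;])(n=)([0-9]+(?:\+[0-9]+)*)', re.IGNORECASE)

def replace_in_hrefs(text, mapping):
    """Swap mapped values inside the n= list of every matching href."""
    def swap_list(match):
        # Each original item is looked up once; unmapped items (0, 450) pass through.
        items = match.group(2).split('+')
        return match.group(1) + '+'.join(mapping.get(item, item) for item in items)

    def swap_href(match):
        # Only the URL between the quotes is rewritten.
        return match.group(1) + N_LIST.sub(swap_list, match.group(2)) + match.group(3)

    return HREF.sub(swap_href, text)
```

With the example map, the link ending in n=0+450+214867+214868 comes out ending in n=0+450+214868+218633, and no second clean-up pass over the files is needed.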

Could you please help me with this? I'm not an expert in these kinds of changes, so any help would be massively appreciated.

Many thanks in advance,
Alex

I will write an outline of the code in a kind of pseudocode. I don't remember Python well enough to write the code in Python.

First, find the type of each link (if it contains n=0 it is type 3, if it contains "+" it is type 2, else it is type 1) and get the list of strings from the "n=..." part by exploding it (explode is the name of a PHP function) on the "+" sign.

The first loop is over the links. The second loop is over each n= number. The third loop looks in the map file and finds the replacement value. Load the data of the map file into a variable before the loops, because file reading is the slowest operation you have in programming. Replace the value in the third loop, then implode (PHP function) the list of new strings into the new link when returning to the first loop.
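The outline above might look like this in Python, where `split` and `join` play the role of explode and implode. The function names are hypothetical. Two assumptions differ slightly from the outline: only a list that is exactly 0 is treated as type 3, and the third loop is replaced by a dictionary lookup.

```python
def load_map(lines):
    """Parse 'old new' rows into a dict, skipping the header and filler rows."""
    mapping = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2 and all(p.isdigit() and len(p) == 6 for p in parts):
            mapping[parts[0]] = parts[1]
    return mapping

def link_type(n_list):
    """Classify an n= list: 3 = just 0, 2 = several values, 1 = a single value."""
    if n_list == '0':
        return 3
    return 2 if '+' in n_list else 1

def translate(n_list, mapping):
    """Explode on '+', look each value up in the dict, and implode again."""
    return '+'.join(mapping.get(value, value) for value in n_list.split('+'))

# Usage, assuming the map lives in map-file.txt as in the question:
#     with open('map-file.txt') as handle:
#         mapping = load_map(handle)
```

Reading the map once into a dict means each of the 6,000 entries is found without scanning, instead of looping over the whole map for every number.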

You probably have several files with links, so you need to loop over the files as well.

When dealing with repeated codes, you need a while loop that runs until a spare number is found. You also need to save the numbers already used in a list.

