Regular expression template using lookahead assertions in Python
I built the following regular expression pattern using grouping and named groups after Landsat's new product identifier pattern. It looks like this (see also link to pythex.org):
(?P<prefix>L)(?P<sensor>[C|O|T|E|M])(?P<satellite>0[14578])(?P<delimiter>_)(?P<processing_correction_level>(L1(?:TP|GT|GS)))(?P=delimiter)(?P<path>[012][0-9][0-9])(?P<row>[01][0-9][0-9]|2[0-4][0-3])(?P=delimiter)(?P<acquisition_year>19|20\d\d)(?P<acquisition_month>0[1-9]|1[012])(?P<acquisition_day>0[1-9]|[12][0-9]|3[01])(?P=delimiter)(?P<processing_year>19|20\d\d)(?P<processing_month>0[1-9]|1[012])(?P<processing_day>0[1-9]|[12][0-9]|3[01])(?P=delimiter)(?P<collection>0[12])(?P=delimiter)(?P<category>RT|T[1|2])(?P=delimiter)(?P<band>B[0-9Q][01A]?).TIF
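To illustrate what the named groups give you, here is a minimal sketch that applies the template above to a single file name and returns the captured fields. The function name `parse_product_identifier` is made up for this illustration, the year part is written as `20\d\d` (as in the second template further down), and the pattern is split over several lines purely for display.

```python
import re

# The first template from the question; adjacent string literals are
# concatenated by Python, so this is still one single pattern.
TEMPLATE = (
    r"(?P<prefix>L)(?P<sensor>[C|O|T|E|M])(?P<satellite>0[14578])"
    r"(?P<delimiter>_)"
    r"(?P<processing_correction_level>(L1(?:TP|GT|GS)))(?P=delimiter)"
    r"(?P<path>[012][0-9][0-9])(?P<row>[01][0-9][0-9]|2[0-4][0-3])(?P=delimiter)"
    r"(?P<acquisition_year>19|20\d\d)(?P<acquisition_month>0[1-9]|1[012])"
    r"(?P<acquisition_day>0[1-9]|[12][0-9]|3[01])(?P=delimiter)"
    r"(?P<processing_year>19|20\d\d)(?P<processing_month>0[1-9]|1[012])"
    r"(?P<processing_day>0[1-9]|[12][0-9]|3[01])(?P=delimiter)"
    r"(?P<collection>0[12])(?P=delimiter)"
    r"(?P<category>RT|T[1|2])(?P=delimiter)"
    r"(?P<band>B[0-9Q][01A]?).TIF"
)
PATTERN = re.compile(TEMPLATE)


def parse_product_identifier(filename):
    """Return a dict of the named groups, or None if the name does not match.

    Hypothetical helper, not part of the original code.
    """
    match = PATTERN.match(filename)
    return match.groupdict() if match else None


if __name__ == "__main__":
    fields = parse_product_identifier(
        "LC08_L1TP_184033_20181028_20181027_01_RT_BQA.TIF"
    )
    for name, value in fields.items():
        print(name, value)
```

Every named group ends up as a key of the returned dictionary, which is the main benefit of the long group names.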
Given a directory that contains, say, the following file(names):
LC08_L1TP_184033_20170328_20161027_01_RT_B4.TIF
LC08_L1TP_184033_20171128_20161027_01_RT_B4.TIF
LC08_L1TP_184033_20171128_20181027_01_RT_B1.TIF
LC08_L1TP_184033_20171128_20181027_01_RT_B4.TIF
LC08_L1TP_184033_20173328_20161027_01_RT_B4.TIF
LC08_L1TP_184033_20181028_20181027_01_RT_BQA.TIF
LC08_L1TP_184033_20181028_20181028_01_RT_ANG.txt
LC08_L1TP_184033_20181028_20181028_01_RT_B10.TIF
LC08_L1TP_184033_20181028_20181028_01_RT_B11.TIF
LC08_L1TP_184033_20181028_20181028_01_RT_B1.TIF
LC08_L1TP_184033_20181028_20181028_01_RT_B2.TIF
LC08_L1TP_184033_20181028_20181028_01_RT_B3.TIF
LC08_L1TP_184033_20181028_20181028_01_RT_B4.TIF
LC08_L1TP_184033_20181028_20181028_01_RT_B5.TIF
LC08_L1TP_184033_20181028_20181028_01_RT_B6.TIF
LC08_L1TP_184033_20181028_20181028_01_RT_B7.TIF
LC08_L1TP_184033_20181028_20181028_01_RT_B8.TIF
LC08_L1TP_184033_20181028_20181028_01_RT_B9.TIF
LC08_L1TP_184033_20181028_20181028_01_RT_BQA.TIF
LC08_L1TP_184033_20181028_20181028_01_RT_MTL.txt
LC08_L1TP_184033_20181128_20181027_01_RT_B1.TIF
LC08_L1TP_184033_20181128_20181027_01_RT_BQA.TIF
and a slightly modified template, i.e. the following:
(?P<prefix>L)(?P<sensor>[C|O|T|E|M])(?P<satellite>0[14578])(?P<delimiter>_)(?P<processing_correction_level>(L1(?:TP|GT|GS)))(?P=delimiter)(?P<path>[012][0-9][0-9])(?P<row>[01][0-9][0-9]|2[0-4][0-3])(?P=delimiter)(?P<acquisition_year>19|20\d\d)(?P<acquisition_month>0[1-9]|1[012])(?P<acquisition_day>0[1-9]|[12][0-9]|3[01])(?P=delimiter)(?P<processing_year>19|20\d\d)(?P<processing_month>0[1-9]|1[012])(?P<processing_day>0[1-9]|[12][0-9]|3[01])(?P=delimiter)(?P<collection>0[12])(?P=delimiter)(?P<category>RT|T[1|2])(?P=delimiter)(?P<band>B{BAND}).TIF
the following Python function:
import glob
import os
import re


def retrieve_selected_bands(bands, scene):
    """
    Retrieve user requested bands from a Landsat scene

    Parameters
    ----------
    bands :
        User requested bands
    scene :
        Landsat scene directory

    Returns
    -------
    Returns list of filenames of user requested bands

    Example
    -------
    ...
    """
    requested_bands = []
    for band in bands:
        for filename in os.listdir(scene):
            # regular_expression_variable holds the template shown above
            template = regular_expression_variable.format(BAND=band)
            pattern = re.compile(template)
            if pattern.match(filename):
                requested_bands.append(glob.glob(filename)[0])
    print()
    print('\n'.join(map(str, requested_bands)))
will retrieve successfully what is asked for, i.e.:
retrieve_selected_bands(bands, '.')
LC08_L1TP_184033_20181028_20181028_01_RT_B4.TIF
LC08_L1TP_184033_20171128_20161027_01_RT_B4.TIF
LC08_L1TP_184033_20170328_20161027_01_RT_B4.TIF
LC08_L1TP_184033_20171128_20181027_01_RT_B4.TIF
LC08_L1TP_184033_20181028_20181028_01_RT_B5.TIF
LC08_L1TP_184033_20181028_20181028_01_RT_B10.TIF
LC08_L1TP_184033_20181028_20181028_01_RT_B11.TIF
LC08_L1TP_184033_20181128_20181027_01_RT_BQA.TIF
LC08_L1TP_184033_20181028_20181027_01_RT_BQA.TIF
LC08_L1TP_184033_20181028_20181028_01_RT_BQA.TIF
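For anyone who wants to try the `{BAND}` substitution without a scene directory, the following is a self-contained sketch that filters a plain list of file names instead of calling `os.listdir`. The names `BAND_TEMPLATE` and `select_bands` are made up for this illustration; the pattern itself is the second template from above, and `re.escape` is added only as a precaution for band names containing special characters.

```python
import re

# Second template from the question, with a {BAND} placeholder for str.format.
BAND_TEMPLATE = (
    r"(?P<prefix>L)(?P<sensor>[C|O|T|E|M])(?P<satellite>0[14578])"
    r"(?P<delimiter>_)"
    r"(?P<processing_correction_level>(L1(?:TP|GT|GS)))(?P=delimiter)"
    r"(?P<path>[012][0-9][0-9])(?P<row>[01][0-9][0-9]|2[0-4][0-3])(?P=delimiter)"
    r"(?P<acquisition_year>19|20\d\d)(?P<acquisition_month>0[1-9]|1[012])"
    r"(?P<acquisition_day>0[1-9]|[12][0-9]|3[01])(?P=delimiter)"
    r"(?P<processing_year>19|20\d\d)(?P<processing_month>0[1-9]|1[012])"
    r"(?P<processing_day>0[1-9]|[12][0-9]|3[01])(?P=delimiter)"
    r"(?P<collection>0[12])(?P=delimiter)"
    r"(?P<category>RT|T[1|2])(?P=delimiter)"
    r"(?P<band>B{BAND}).TIF"
)


def select_bands(filenames, bands, template=BAND_TEMPLATE):
    """Return the file names whose band field matches one of `bands`.

    Hypothetical helper: the pattern is compiled once per band, then
    applied to every file name in the given list.
    """
    selected = []
    for band in bands:
        pattern = re.compile(template.format(BAND=re.escape(band)))
        selected.extend(name for name in filenames if pattern.match(name))
    return selected


if __name__ == "__main__":
    names = [
        "LC08_L1TP_184033_20181028_20181028_01_RT_B4.TIF",
        "LC08_L1TP_184033_20181028_20181028_01_RT_B10.TIF",
        "LC08_L1TP_184033_20181028_20181028_01_RT_BQA.TIF",
        "LC08_L1TP_184033_20173328_20161027_01_RT_B4.TIF",
        "LC08_L1TP_184033_20181028_20181028_01_RT_MTL.txt",
    ]
    print("\n".join(select_bands(names, ["4", "QA"])))
```

The file with the acquisition date 20173328 is skipped because 33 is not accepted by the month group.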
I want to understand if and how the regular expression template can be improved, i.e. become shorter and thus more readable, by using for example lookahead assertions, like:
(?P<prefix>L)(?P<sensor>[C|O|T|E|M])(?P<satellite>0[14578])(P<delimiter>_)(?P<processing_correction_level>(L1(?:TP|GT|GS)))(?P=delimiter)(?P<path>[012][0-9][0-9])(?P<row>[01][0-9][0-9]|2[0-4][0-3])(?P=delimiter)(?P<year>19|20\d\d)(?P<month>0[1-9]|1[012])(?P<day>0[1-9]|[12][0-9]|3[01])(?P=delimiter)(?P=year)(?P=month)(?P=day)(?P=delimiter)(?P<collection>0[12])(?P=delimiter)(?P<category>RT|T[1|2])(?P=delimiter)(?P<band>B[0-9Q][01A]?).TIF
However, the latter version of the template fails to capture all "valid" strings. If I am not wrong, it fails on the (?P=day) part.
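A small experiment that isolates the behaviour: in Python's `re` module, `(?P=name)` is a backreference, so it matches the exact text the named group captured earlier rather than re-applying that group's sub-pattern. The sketch below reduces the template to the two date fields; the names `DATE`, `same_date_twice` and `compiles` are made up for this illustration. It also checks the `(P<delimiter>_)` group written without the `?`, as it appears in the shortened template above.

```python
import re

# Only the date part of the template, followed by backreferences to it.
DATE = r"(?P<year>19|20\d\d)(?P<month>0[1-9]|1[012])(?P<day>0[1-9]|[12][0-9]|3[01])"
SAME_DATE_TWICE = re.compile(DATE + r"_(?P=year)(?P=month)(?P=day)")


def same_date_twice(text):
    """True if `text` is two dates joined by "_" that the pattern accepts."""
    return SAME_DATE_TWICE.fullmatch(text) is not None


def compiles(pattern):
    """True if `pattern` is accepted by re.compile, False on re.error."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


if __name__ == "__main__":
    print(same_date_twice("20181028_20181028"))  # identical dates: True
    print(same_date_twice("20181028_20181027"))  # different day: False
    # Without the "?" the group is not named, so the backreference to
    # "delimiter" refers to a group that does not exist.
    print(compiles(r"(P<delimiter>_)(?P=delimiter)"))   # False
    print(compiles(r"(?P<delimiter>_)(?P=delimiter)"))  # True
```

So a backreference can only ever accept file names whose processing date is identical to the acquisition date, which would explain why some valid names are rejected.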
How can the first template be improved? I.e., become shorter and still include all date patterns of valid Landsat product identifier strings?
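One direction I could imagine, offered only as a sketch and not as a tested replacement: build the pattern in Python from reusable fragments, so the date sub-pattern is written once and reused under two group-name prefixes. The names `date_groups`, `FIELDS` and `PRODUCT_ID` are made up here. The sub-patterns are copied from the first template; the one deliberate difference is that the delimiter is assumed to be a literal `_` inserted by `"_".join`, so the `delimiter` group disappears.

```python
import re


def date_groups(prefix):
    """Return the year/month/day sub-pattern with prefixed group names.

    Hypothetical helper; the sub-patterns are those of the first template.
    """
    return (
        rf"(?P<{prefix}_year>19|20\d\d)"
        rf"(?P<{prefix}_month>0[1-9]|1[012])"
        rf"(?P<{prefix}_day>0[1-9]|[12][0-9]|3[01])"
    )


# One entry per underscore-separated field of the product identifier.
FIELDS = [
    r"(?P<prefix>L)(?P<sensor>[C|O|T|E|M])(?P<satellite>0[14578])",
    r"(?P<processing_correction_level>(L1(?:TP|GT|GS)))",
    r"(?P<path>[012][0-9][0-9])(?P<row>[01][0-9][0-9]|2[0-4][0-3])",
    date_groups("acquisition"),
    date_groups("processing"),
    r"(?P<collection>0[12])",
    r"(?P<category>RT|T[1|2])",
    r"(?P<band>B[0-9Q][01A]?).TIF",
]
PRODUCT_ID = re.compile("_".join(FIELDS))


if __name__ == "__main__":
    match = PRODUCT_ID.match("LC08_L1TP_184033_20171128_20181027_01_RT_B4.TIF")
    print(match.group("acquisition_year"), match.group("processing_year"))
```

The compiled pattern is just as long as before, but the source that has to be read and maintained is shorter, and both dates keep their own independent groups.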
python regex template
Hey, welcome to Code Review! This question does not match what this site is about. Code Review is about improving existing, working code. Code Review is not the site to ask for help in fixing or changing what your code does. Once the code does what you want, we would love to help you do the same thing in a cleaner way! Please see our help center for more information.
– Graipher
16 hours ago
@Graipher, if I copy-paste the python code that uses the first pattern, which works fine, and ask again, the same question (which is: can I make this template shorter?--so as to save space, make it perhaps more readable, so improving), will you then accept this question as a valid one for Code Review?
– Nikos Alexandris
10 hours ago
If you only use the first one and show valid Python code that uses it (just wrap it in a function) and preferably add those test cases you mention, it would probably be on-topic.
– Graipher
10 hours ago
@Graipher I tried to improve the question. Is it valid for Code Review now?
– Nikos Alexandris
9 hours ago
edited 9 hours ago
asked 17 hours ago
Nikos Alexandris