Separate Names

The_Silencer

Developer
Jan 29, 2013
48
0
0
N/A
I am scraping a site and the title has title="watch 1000 Ways to Die (2008) online">

Currently I am using title="watch (.+?) online"> in the add-on, which gives me the name '1000 Ways to Die (2008)'.

I would like to separate the title (1000 Ways to Die) from the year. I think that is what's giving me issues with using metahandler for TV shows. So I would still define 'name' and also add 'year'.


Any suggestions?
 

Bstrdsmkr

New member
Mar 16, 2012
763
0
0
tl;dr
Code:
title="watch (.+?)\s\(([\d]{4})\) online">
Starting with what you had:
Code:
title="watch (.+?) online">
We just need to indicate where the title stops and where the year starts. We know that the year is wrapped in parentheses, with whitespace (\s) before the opening paren and a space after the closing one. We also need to escape the parens so the regex engine knows they aren't a grouping operation, but actual paren characters:
Code:
title="watch (.+?)\s\(2008\) online">
Next, we generalize the year. The year is a set of digits only ([\d]) that always repeats exactly 4 times ({4}):
Code:
title="watch (.+?)\s\([\d]{4}\) online">
Finally, we surround the interesting part of the year with (not-escaped) parentheses to indicate that we want to return it too:
Code:
title="watch (.+?)\s\(([\d]{4})\) online">
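For anyone wanting to try this outside the add-on, here is a minimal sketch using Python's standard re module. The function name split_title and the sample string are only for illustration and are not part of any add-on or of metahandler; note that both captured groups come back as strings.

```python
import re

# Group 1 captures the show name, group 2 the four-digit year.
TITLE_RE = re.compile(r'title="watch (.+?)\s\(([\d]{4})\) online">')

def split_title(html):
    """Return (name, year) as strings, or None if the pattern is not found.

    split_title is a hypothetical helper name used for this example only.
    """
    match = TITLE_RE.search(html)
    if match is None:
        return None
    return match.group(1), match.group(2)

# Sample input taken from the question above.
sample = 'title="watch 1000 Ways to Die (2008) online">'
result = split_title(sample)
```

Here result should unpack into 'name' and 'year'. A title with no "(YYYY)" part returns None, so the caller may want to fall back to the original single-group pattern in that case.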
 

jas0npc

Banned
May 5, 2012
2,449
0
0
UK
That's near enough what I have, except that for the 4 digits I had [0-9][0-9][0-9][0-9]. So I'm off to alter them all to this :)
 

Bstrdsmkr

FYI, you can also do ranges like [\d]{5,10} (note the comma between the two bounds), which would match any run of digits at least 5, but not more than 10, characters long
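A quick sketch of a bounded repetition in Python's re module, where the lower and upper bounds are separated by a comma, as in {5,10}. The ^ and $ anchors and the helper name is_5_to_10_digits are additions for this example, so that the whole string is checked rather than just a substring.

```python
import re

# {5,10} means "at least 5 and at most 10 repetitions" of the preceding item.
# The anchors make the pattern cover the whole string, not just part of it.
DIGITS_5_TO_10 = re.compile(r'^[\d]{5,10}$')

def is_5_to_10_digits(text):
    """Hypothetical helper: True if text is 5 to 10 digits and nothing else."""
    return DIGITS_5_TO_10.match(text) is not None

checks = [is_5_to_10_digits(s) for s in ('1234', '12345', '1234567890', '12345678901')]
```

Without the anchors, a longer run of digits would still match, because the pattern would be satisfied by the first 10 of them.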
 

The_Silencer

Thank you very much for the great explanation. I am going to update my add-on, and I bet that will fix my metahandler issue for TV shows.