Alternate form of underscore?

Re: Alternate form of underscore?

by MMFrLife » Fri Jan 06, 2017 4:38 am

Yeah, ZD taught me a little bit a while back, including the escape concept, but I haven't really practiced or looked into things since.
I didn't know about the word character element "\w". That really opens things up. ZD pointed me to a site or two, but I've never seen
that one. It looks really great for learning and experimenting. I'll definitely need to explore it. If I have any questions, I'll post to Off Topic under
something like "Regular Expressions". Be on the lookout if you're interested.

Thanks in bunches! :D

Re: Alternate form of underscore?

by rivorson » Thu Jan 05, 2017 10:18 pm

Seems like you have it perfectly.

http://regexr.com/ has a good reference and cheat sheet if you want to learn more. That's where I learned about it, but it is complicated and it took me several attempts just to get started.

Re: Alternate form of underscore?

by MMFrLife » Thu Jan 05, 2017 8:05 pm

So, I could really just use the one "\w" and add any characters after it, escaping each one so that both the characters that need an escape and the ones that don't are covered?

like

Code:

\w\-\.\[\
It appears to work when tested.

Re: Alternate form of underscore?

by MMFrLife » Thu Jan 05, 2017 8:02 pm

Ok. Thanks!
I will play around with it.

Re: Alternate form of underscore?

by rivorson » Thu Jan 05, 2017 7:48 pm

The escaped version would work but you don't need to add the extra \w. Outside square brackets the dot is interpreted to mean any character in regex, so escaping it is the safe choice.

The full regex for both hyphen and the dot would be:

Code:

(\s(?:as?|and?|'n'|at|by|del?|des|du|el|feat\.?|for|from|in|into|las?|les?|los|of|on|or|pres\.|the|to|vs\.?)(?=\s))|(\b(?:III?|PM|SOS|UK|USA))\b|([\w-\.\xDF-\xF6\xF8-\xFF\u0100-\u024F\u0400-\u04FF])([\w'\xDF-\xF6\xF8-\xFF\u0100-\u024F\u0400-\u04FF]*)
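
A small sketch of the dot behaviour, written in Python rather than in the add-on itself. The names ANY_CHAR, LITERAL_DOT, FIRST_CHAR and capitalize_first are made up for illustration, FIRST_CHAR is a cut-down stand-in for the last part of the preset, and the replacement step (upper-case the first captured character) is an assumption about what the preset does. The hyphen is escaped here because regex engines differ in how they read a bare hyphen after "\w".

```python
import re

# Outside a character class an unescaped dot stands for any character.
ANY_CHAR = re.compile(r"a.c")
# An escaped dot only matches a literal period.
LITERAL_DOT = re.compile(r"a\.c")

# Hypothetical, simplified version of the preset's last part with
# the hyphen and the dot added after the first \w.
FIRST_CHAR = re.compile(r"([\w\-\.])([\w']*)")


def capitalize_first(text):
    # Assumed replacement: upper-case group 1, keep group 2 unchanged.
    return FIRST_CHAR.sub(lambda m: m.group(1).upper() + m.group(2), text)


print(bool(ANY_CHAR.fullmatch("abc")))     # the dot matched "b"
print(bool(LITERAL_DOT.fullmatch("abc")))  # no literal period in "abc"
print(bool(LITERAL_DOT.fullmatch("a.c")))
print(capitalize_first("ep .lv -xb"))
```

With the dot and hyphen in the class, ".lv" and "-xb" start their own groups, so the letter after them is no longer treated as the first character.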

Re: Alternate form of underscore?

by MMFrLife » Thu Jan 05, 2017 7:13 pm

Right, right. I was focusing too hard on the first section.

So, if I want to add more than one, this appears to be ok

Code:

\w-\w.\
but if I want to add multiple with escape, would it be

Code:

\w\-\w\.\
?

Re: Alternate form of underscore?

by rivorson » Thu Jan 05, 2017 6:45 pm

Both Capitalize presets have it. They're actually the same regex but the one with exceptions has some extra code added at the beginning.

This is the Capitalize with exceptions preset modified to not capitalize after a hyphen:

Code:

(\s(?:as?|and?|'n'|at|by|del?|des|du|el|feat\.?|for|from|in|into|las?|les?|los|of|on|or|pres\.|the|to|vs\.?)(?=\s))|(\b(?:III?|PM|SOS|UK|USA))\b|([\w-\xDF-\xF6\xF8-\xFF\u0100-\u024F\u0400-\u04FF])([\w'\xDF-\xF6\xF8-\xFF\u0100-\u024F\u0400-\u04FF]*)
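
For anyone who wants to try this outside the add-on, here is a rough Python sketch of how the four capture groups might be used. The pattern is the one above with the hyphen escaped for Python's re module. The replace_match function is a guess at the replacement step, not the add-on's actual code: it leaves exception words (group 1) and acronyms (group 2) alone and upper-cases the first character of everything else (group 3).

```python
import re

# Pattern from the post above, hyphen escaped for Python's re module.
PATTERN = re.compile(
    r"(\s(?:as?|and?|'n'|at|by|del?|des|du|el|feat\.?|for|from|in|into"
    r"|las?|les?|los|of|on|or|pres\.|the|to|vs\.?)(?=\s))"
    r"|(\b(?:III?|PM|SOS|UK|USA))\b"
    r"|([\w\-\xDF-\xF6\xF8-\xFF\u0100-\u024F\u0400-\u04FF])"
    r"([\w'\xDF-\xF6\xF8-\xFF\u0100-\u024F\u0400-\u04FF]*)"
)


def replace_match(m):
    # Hypothetical replacement logic (an assumption, not the preset's own).
    if m.group(1) is not None or m.group(2) is not None:
        return m.group(0)  # exception word or acronym: leave as typed
    return m.group(3).upper() + m.group(4)


def capitalize_with_exceptions(text):
    return PATTERN.sub(replace_match, text)


print(capitalize_with_exceptions("back in black -xb"))
print(capitalize_with_exceptions("live in the USA"))
```

Because the hyphen is now part of the first-character class, "-xb" is matched as one group starting with "-", and the "x" stays lowercase.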

Re: Alternate form of underscore?

by MMFrLife » Thu Jan 05, 2017 6:08 pm

I'm using the "...with exceptions" preset, as mentioned.
Only the regular "Capitalize" preset has a "\w" close to an apostrophe, and one later.

Re: Alternate form of underscore?

by rivorson » Thu Jan 05, 2017 3:41 pm

I've had a look at the regex on that preset. It will capitalize the first character in every group of 'word' characters. Regex defines a 'word' character as anything alphanumeric plus the underscore, so the underscore is the only character that will not cause the following letter to be capitalized in that preset by default.

You could add any character you want to the preset. Just add the character you want to use to the regex after the first "\w" (not the one that is immediately followed by an apostrophe). For example, if you want to use the hyphen then that part would become "\w-". Some characters have special meaning in regex, so to be safe you could escape it using a backslash, e.g. "\w\-".
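
To make the idea concrete, here is a minimal Python sketch of the "first character of every word group" behaviour. DEFAULT and WITH_HYPHEN are simplified stand-ins for the last part of the preset (the accented-letter ranges are left out), and capitalize_groups is a made-up helper that assumes the replacement simply upper-cases the first captured character.

```python
import re

# Simplified stand-ins for the end of the preset (hypothetical names).
DEFAULT = re.compile(r"([\w])([\w']*)")
WITH_HYPHEN = re.compile(r"([\w\-])([\w']*)")


def capitalize_groups(pattern, text):
    # Assumed replacement: upper-case group 1, keep group 2 as it is.
    return pattern.sub(lambda m: m.group(1).upper() + m.group(2), text)


sample = "back in black _lv -xb"
print(capitalize_groups(DEFAULT, sample))
print(capitalize_groups(WITH_HYPHEN, sample))
```

In the default version the underscore counts as the first character of "_lv", so "lv" is left alone, while "-xb" becomes "-Xb" because the hyphen is not a word character. Once the hyphen is added to the class it plays the same role as the underscore.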

Re: Alternate form of underscore?

by MMFrLife » Thu Jan 05, 2017 1:20 pm

According to Wikipedia's "Underscore" article, such a thing does not exist.

1. There is a hyphen and a minus (same thing, shortest length).
2. There are 4 dashes of varying lengths, the shortest of which is longer than a hyphen/minus (about the length of 2 hyphens).
3. And a single underscore. The length is that of the shortest dash, but in the bottom position.

To make a long story short, its purpose is as a leading separator for Folder tag ID info. It needs to be a character that doesn't
get ignored, because an ignored separator causes the second character (the first letter) to be capitalized by "Capitalize with exceptions" when running the RFR add-on.
Periods and no-space hyphens are ignored, so IDs like "ep", ".lv", "-xb" get their first letters capitalized, as in "-Xb".

The underscore does not get ignored (but goes unchanged), leaving what it detects as the second letter (actually the first) in lowercase.
That is a good thing. However, when used as a lead off next to an album or title, it's just a bit ugly/awkward looking. "Back in Black _lv"

Anyway, I think I'll just use that. It's looking better the more I look at it and a couple of spaces in between helps the looks.
It also works better when used in searching. The period is ignored and the hyphen negates the query, as expected.

Re: Alternate form of underscore?

by rivorson » Wed Jan 04, 2017 2:19 pm

I don't think such a character exists. There's a longer hyphen but not a lower hyphen.

What are you trying to achieve?

Re: Alternate form of underscore?

by MMFrLife » Wed Jan 04, 2017 12:42 pm

Nice try!
Unfortunately, it provides the same length.

Looks like I'll have to bother ZD later on about a way around what I'm trying to do.

Thanks a mil. 8)

Re: Alternate form of underscore?

by dtsig » Wed Jan 04, 2017 11:43 am

I haven't tried it for a long time but ... if you look at a table for ASCII ... look at the higher range of chars. They can be created by holding ALT and entering the number on the keypad.

Alternate form of underscore?

by MMFrLife » Wed Jan 04, 2017 8:58 am

Does anyone know of a way to make an underscore that's like a hyphen (lengthwise) but in the underscore position?
Maybe a special character. I've never seen one, though.

Using a period instead will not work for what I need.

Thanks
