I have strings like:
xx-xxx-xxxxx-xxx-xx
th-asd-welcome-ruk-23
name-lastname-hello
I want to match the second set of characters delimited by a hyphen, for example: xxx, asd, lastname.
I tried:
^(?:[^-]+-){1}([^-]+), which matches xx-xxx
(?<=-)[^\-].+(?=-), which matches xxx-xxxxx-xxx
How do I match the second octet?
Regular expressions aren't the right way to go about doing this. First, I'd simplify the task:
strings = %w[xx-xxx-xxxxx-xxx-xx th-asd-welcome-ruk-23 name-lastname-hello]
strings.map { |s| s.split('-')[1] } # => ["xxx", "asd", "lastname"]
That splits each string on hyphens into an array, and returns the second item found in each array.
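As a variation on the same idea, `String#split` accepts an optional limit argument, so the split can stop once the second field has been isolated. This is only a sketch; `second_field` is a hypothetical helper name, not something from the question.

```ruby
strings = %w[xx-xxx-xxxxx-xxx-xx th-asd-welcome-ruk-23 name-lastname-hello]

# Hypothetical helper: return the second hyphen-delimited field, or nil
# when the string has no hyphen. The limit of 3 tells split to produce at
# most three pieces, so everything after the second hyphen stays unsplit.
def second_field(str)
  str.split('-', 3)[1]
end

results = strings.map { |s| second_field(s) }
p results                    # the second field of each sample string
p second_field('nohyphen')   # nil, because there is no second field
```

Whether the limit makes a measurable difference depends on how long the strings are, so it is worth benchmarking rather than assuming.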
Then I'd use a pattern in conjunction with scan to find what is wanted:
strings.map { |s| s.scan(/[^-]+/)[1] } # => ["xxx", "asd", "lastname"]
That finds the sub-strings that are not hyphens.
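The split and scan approaches agree on the sample strings, but they are not interchangeable on every input. A small check like the following (a sketch; the `'a--b'` input is an assumed edge case, not one from the question) makes the difference visible: split keeps the empty field between two consecutive hyphens, while scan skips over it.

```ruby
strings = %w[xx-xxx-xxxxx-xxx-xx th-asd-welcome-ruk-23 name-lastname-hello]

# Both approaches on the sample data from the question.
by_split = strings.map { |s| s.split('-')[1] }
by_scan  = strings.map { |s| s.scan(/[^-]+/)[1] }
same_on_samples = (by_split == by_scan)

# Assumed edge case: two consecutive hyphens, i.e. an empty second field.
edge = 'a--b'
edge_split = edge.split('-')[1]      # the empty field between the hyphens
edge_scan  = edge.scan(/[^-]+/)[1]   # the next non-empty run of characters

puts "same on samples: #{same_on_samples}"
puts "split on #{edge.inspect}: #{edge_split.inspect}"
puts "scan on #{edge.inspect}: #{edge_scan.inspect}"
```

Which behavior is correct depends on what the data is allowed to contain, so it helps to decide that before picking an approach.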
Regular expressions are great things, but they can increase maintenance problems way beyond their value. They can also slow code remarkably, because they're not smart, and adding intelligence requires a lot of testing and knowledge of what the engine is going to do. So go there carefully, and test and benchmark to know whether a pattern, and a path, is fast. I've written a lot of code and mentored a lot of developers, and I've found a lot of places where someone used a pattern wrong and introduced hard-to-detect bugs or slowed loops dramatically. I'm a big believer in benchmarking and testing multiple paths.
For instance:
require 'fruity'

strings = %w[xx-xxx-xxxxx-xxx-xx th-asd-welcome-ruk-23 name-lastname-hello]

compare do
  split { strings.map { |s| s.split('-')[1] } }
  scan { strings.map { |s| s.scan(/[^-]+/)[1] } }
end

# >> Running each test 1024 times. Test will take about 1 second.
# >> split is faster than scan by 4x ± 0.1
From experience I knew split would run faster than using scan with a pattern.
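If the fruity gem isn't available, a rough comparison can be sketched with Ruby's standard `benchmark` library instead. This is a cruder measurement than fruity's, and the iteration count here is an arbitrary assumption; the timings will vary by machine and Ruby version.

```ruby
require 'benchmark'

strings = %w[xx-xxx-xxxxx-xxx-xx th-asd-welcome-ruk-23 name-lastname-hello]
iterations = 10_000 # arbitrary count chosen for this sketch

# Benchmark.realtime returns the elapsed wall-clock time in seconds.
split_time = Benchmark.realtime do
  iterations.times { strings.map { |s| s.split('-')[1] } }
end

scan_time = Benchmark.realtime do
  iterations.times { strings.map { |s| s.scan(/[^-]+/)[1] } }
end

puts format('split: %.4fs', split_time)
puts format('scan:  %.4fs', scan_time)
```

A single run like this is noisy, so repeat it a few times before drawing conclusions.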
From experience I also know that patterns can be extremely fast. Testing some variations of patterns and ways of extracting the records:
require 'fruity'

strings = %w[xx-xxx-xxxxx-xxx-xx th-asd-welcome-ruk-23 name-lastname-hello]

compare do
  split { strings.map { |s| s.split('-')[1] } }
  scan { strings.map { |s| s.scan(/[^-]+/)[1] } }
  slice { strings.map { |s| s[/^(?:[^-]+)-([^-]+)/, 1] } }
  stribizhev { strings.map { |s| s.match(/(?<=-)[^-]+(?=-)/)[0] } }
end

# >> Running each test 2048 times. Test will take about 1 second.
# >> slice is faster than split by 60.00000000000001% ± 10.0%
# >> split is faster than stribizhev by 1.9x ± 0.1
# >> stribizhev is faster than scan by 80.0% ± 10.0%
So, that should help determine the path to follow. Then weigh the maintenance cost: which is easier to maintain, s.split('-')[1] or s[/^(?:[^-]+)-([^-]+)/, 1]?
And the reason that last pattern outruns a simple split, which is itself extremely fast, is that the pattern is anchored. Anchoring a pattern gives the engine an incredibly useful hint about how to locate what is desired, and it uses that to its advantage. It doesn't need to backtrack, which wastes the engine's CPU time; instead, the engine can continue looking forward to find what it wants.
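To make the anchored pattern concrete, here is a small sketch of how it behaves on the sample strings. It also shows why the first attempt in the question appeared to match xx-xxx: the whole match covers both fields, while capture group 1 holds only the second one. The variable names and the `'nohyphen'` input are illustrative assumptions.

```ruby
strings = %w[xx-xxx-xxxxx-xxx-xx th-asd-welcome-ruk-23 name-lastname-hello]

# The anchored pattern from the benchmark: start at the beginning,
# skip the first field and its hyphen, then capture the second field.
anchored = /^(?:[^-]+)-([^-]+)/

m = strings.first.match(anchored)
whole_match  = m[0] # everything the pattern consumed, both fields
second_field = m[1] # only the capture group, the second field

# String#[] with a regexp and a group index returns just that group.
fields  = strings.map { |s| s[anchored, 1] }
missing = 'nohyphen'[anchored, 1] # nil when the pattern does not match

puts "whole match:  #{whole_match}"
puts "second field: #{second_field}"
p fields
p missing
```

Note that in Ruby `^` anchors at the start of any line, while `\A` anchors only at the start of the string; for single-line input like these samples the two behave the same.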