Regular Expressions

by Shubha Vishwanath Karanth

Hi R,

Again stuck with regular expressions...

Suppose,

S=c("World_is_beautiful", "one_two_three_four","My_book")

I need to extract the last-but-one element of each string. So, my output should look like:

Ans=c("is","three","My")

gsub() can do this, but I am wondering how to write the regular expression.

Shubha Karanth | Amba Research

Ph +91 80 3980 8031 | Mob +91 94 4886 4510

Bangalore * Colombo * London * New York * San José * Singapore * www.ambaresearch.com

 



______________________________________________
R-help@... mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Regular Expressions

by Dimitris Rizopoulos

try this:

S <- c("World_is_beautiful", "one_two_three_four","My_book")

sapply(strsplit(S, "_"), tail, n = 2)[1, ]
# or
sapply(strsplit(S, "_"), function(x) x[length(x) - 1])
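
Both forms above assume every string contains at least one "_". A minimal sketch of a more defensive variant follows; the helper name second_last and its sep argument are made up for illustration and are not part of the original reply. It returns NA for strings with fewer than two pieces, where the sapply() calls above would return a list (and the [1, ] indexing would fail).

```r
S <- c("World_is_beautiful", "one_two_three_four", "My_book")

# Hypothetical helper: second-to-last piece of each string,
# or NA when a string has fewer than two pieces.
second_last <- function(x, sep = "_") {
  parts <- strsplit(x, sep, fixed = TRUE)   # split on the literal separator
  vapply(parts, function(p) {
    n <- length(p)
    if (n >= 2) p[n - 1] else NA_character_
  }, character(1))                          # always a character vector
}

second_last(S)                 # "is" "three" "My"
second_last(c(S, "single"))    # last element is NA
```

vapply() is used here only so that the result type is fixed in advance; for the three example strings it gives the same answer as the sapply() calls.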


I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm



Re: Regular Expressions

by Richard Cotton

> S=c("World_is_beautiful", "one_two_three_four","My_book")
>
> I need to extract the last but one element of the strings. So, my
> output should look like:
>
> Ans=c("is","three","My")
>
> gsub() can do this...but wondering how do I give the regular
> expression....

sapply(strsplit(S, "_"), function(x) x[length(x)-1])

You could use regular expressions, but I think it would only be
complicating things.

Regards,
Richie.

Mathematical Sciences Unit
HSL



Re: Regular Expressions

by Gabor Grothendieck

On Tue, May 13, 2008 at 5:02 AM, Shubha Vishwanath Karanth
<shubhak@...> wrote:

> Suppose,
>
> S=c("World_is_beautiful", "one_two_three_four","My_book")
>
> I need to extract the last but one element of the strings. So, my output should look like:
>
> Ans=c("is","three","My")
>
> gsub() can do this...but wondering how do I give the regular expression....
>

As others have mentioned, strsplit is probably easier in this case, but it
can be done with a regular expression as shown below, where [^_]+ matches
any non-empty run of characters other than _ :

> re <- "^([^_]+_)*([^_]+)_([^_]+)$"
> gsub(re, "\\2", S)
[1] "is"    "three" "My"
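
To see what the pattern captures, here is a small sketch using base R's regexec() and regmatches(); the variable names m, second_last and last_piece are only illustrative. For each string the match vector holds the whole match first, then the capture groups in order, so the second group (the last-but-one piece) is element 3 and the third group (the last piece) is element 4.

```r
S  <- c("World_is_beautiful", "one_two_three_four", "My_book")
re <- "^([^_]+_)*([^_]+)_([^_]+)$"

# One character vector per string: whole match, then groups 1, 2, 3.
m <- regmatches(S, regexec(re, S))

second_last <- vapply(m, function(x) x[3], character(1))  # group 2
last_piece  <- vapply(m, function(x) x[4], character(1))  # group 3

second_last   # "is" "three" "My"
last_piece    # "beautiful" "four" "book"

# A string without "_" does not match, so sub()/gsub() return it unchanged.
sub(re, "\\2", "single")   # "single"
```

Because the pattern is anchored with ^ and $, it can match at most once per string, so sub() would give the same result as gsub() here.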

The strapply function in the gsubfn package can also be used.
out below has the same value as strsplit(S, "_"):

library(gsubfn)
out <- strapply(S, "[^_]+")
sapply(out, function(x) tail(x, 2)[1])
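
If installing gsubfn is not an option, a rough base R equivalent can be sketched with gregexpr() and regmatches(). This is an alternative to strapply, not what the gsubfn package does internally; for these inputs it should yield the same pieces as strsplit(S, "_").

```r
S <- c("World_is_beautiful", "one_two_three_four", "My_book")

# Find every run of non-underscore characters in each string.
out <- regmatches(S, gregexpr("[^_]+", S))

# Take the last two pieces of each string and keep the first of them.
res <- sapply(out, function(x) tail(x, 2)[1])
res   # "is" "three" "My"
```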
