diff --git a/02_RProgramming/NOTUSED/grep/Regular Expressions - grep.pdf b/02_RProgramming/NOTUSED/grep/Regular Expressions - grep.pdf
deleted file mode 100644
index f711b11d..00000000
Binary files a/02_RProgramming/NOTUSED/grep/Regular Expressions - grep.pdf and /dev/null differ
diff --git a/02_RProgramming/NOTUSED/grep/index.Rmd b/02_RProgramming/NOTUSED/grep/index.Rmd
deleted file mode 100644
index bd5eb147..00000000
--- a/02_RProgramming/NOTUSED/grep/index.Rmd
+++ /dev/null
@@ -1,337 +0,0 @@
----
-title : Regular Expressions - grep
-subtitle : Computing for Data Analysis
-author : Roger Peng, Associate Professor
-job : Johns Hopkins Bloomberg School of Public Health
-logo : bloomberg_shield.png
-framework : io2012 # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js # {highlight.js, prettify, highlight}
-hitheme : tomorrow #
-url:
- lib: ../../libraries
- assets: ../../assets
-widgets : [mathjax] # {mathjax, quiz, bootstrap}
-mode : selfcontained # {standalone, draft}
----
-
-## Regular Expression Functions
-
-The primary R functions for dealing with regular expressions are
-- `grep`, `grepl`: Search for matches of a regular expression/pattern in a character vector; either return the indices into the character vector that match, the strings that happen to match, or a TRUE/FALSE vector indicating which elements match
-- `regexpr`, `gregexpr`: Search a character vector for regular expression matches and return the indices of the string where the match begins and the length of the match
-- `sub`, `gsub`: Search a character vector for regular expression matches and replace that match with another string
-- `regexec`: Easier to explain through demonstration.
-
----
-
-## grep
-
-Here is an excerpt of the Baltimore City homicides dataset:
-
-```r
-> homicides <- readLines("homicides.txt")
-> homicides[1]
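-[1] "39.311024, -76.674227, iconHomicideShooting, ’p2’, ’<dl><dt>Leon
-Nelson</dt><dd class=\"address\">3400 Clifton Ave.<br />Baltimore, MD
-21216</dd><dd>black male, 17 years old</dd>
-<dd>Found on January 1, 2007</dd><dd>Victim died at Shock
-Trauma</dd><dd>Cause: shooting</dd></dl>’"
-
-> homicides[1000]
-[1] "39.33626300000, -76.55553990000, icon_homicide_shooting, ’p1200’,...
-
-# How can I find the records for all the victims of shootings (as opposed to other causes)?
-> homicides[859]
-[1] "39.33743900000, -76.66316500000, icon_homicide_bluntforce,
-’p914’, ’<dl><dt><a href=\"http://essentials.baltimoresun.com/
-micro_sun/homicides/victim/914/steven-harris\">Steven Harris</a>
-</dt><dd class=\"address\">4200 Pimlico Road<br />Baltimore, MD 21215
-</dd><dd>Race: Black<br />Gender: male<br />Age: 38 years old</dd>
-<dd>Found on July 29, 2010</dd><dd>Victim died at Scene</dd>
-<dd>Cause: Blunt Force</dd><dd class=\"popup-note\"><p>Harris was
-found dead July 22 and ruled a shooting victim; an autopsy
-subsequently showed that he had not been shot,...</dd></dl>’"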
-```
-
----
-
-## grep
-
-By default, `grep` returns the indices into the character vector where the regex pattern matches.
-
-```r
-> grep("^New", state.name)
-[1] 29 30 31 32
-# Setting value = TRUE returns the actual elements of the character vector that match.
-> grep("^New", state.name, value = TRUE)
-[1] "New Hampshire" "New Jersey" "New Mexico" "New York"
-# grepl returns a logical vector indicating which element matches.
-> grepl("^New", state.name)
- [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
-[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
-[25] FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE
-[37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
-[49] FALSE FALSE
-
-```
-
----
-
-## regexpr
-
-Some limitations of `grep`
-- The `grep` function tells you which strings in a character vector match a certain pattern but it doesn’t tell you exactly where the match occurs or what the match is (for a more complicated regex).
-- The `regexpr` function gives you the index into each string where the match begins and the length of the match for that string.
-- `regexpr` only gives you the first match of the string (reading left to right). `gregexpr` will give you all of the matches in a given string.
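-
-A minimal sketch of the difference, using a made-up string `x` (an assumption, not taken from the homicides data): `regexpr` reports only the first match, while `gregexpr` reports every match.
-
-```r
-# Hypothetical example string; "found" occurs twice.
-x <- "found here and found there"
-
-# regexpr: start position and length of the FIRST match only
-first <- regexpr("found", x)
-as.integer(first)               # 1
-attr(first, "match.length")     # 5
-
-# gregexpr: returns a list with one element per input string,
-# each holding the start positions of ALL matches in that string
-all_matches <- gregexpr("found", x)[[1]]
-as.integer(all_matches)         # 1 16
-```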
-
----
-
-## regexpr
-
-How can we find the date of the homicide?
-
-```r
-> homicides[1]
-[1] "39.311024, -76.674227, iconHomicideShooting, ’p2’, ’<dl><dt>Leon
-Nelson</dt><dd class=\"address\">3400 Clifton Ave.<br />Baltimore,
-MD 21216</dd><dd>black male, 17 years old</dd>
-<dd>Found on January 1, 2007</dd><dd>Victim died at Shock
-Trauma</dd><dd>Cause: shooting</dd></dl>’"
-```
-
-Can we just ’grep’ on “Found”?
-
----
-
-## regexpr
-
-The word ’found’ may be found elsewhere in the entry.
-
-```r
-> homicides[954]
-[1] "39.30677400000, -76.59891100000, icon_homicide_shooting, ’p816’,
-’<dl><dd class=\"address\">1400 N Caroline St<br />Baltimore, MD 21213</dd>
-<dd>Race: Black<br />Gender: male<br />Age: 29 years old</dd>
-<dd>Found on March 3, 2010</dd><dd>Victim died at Scene</dd>
-<dd>Cause: Shooting</dd><dd class=\"popup-note\"><p>Wheeler\\’s body
-was found on the grounds of Dr. Bernard Harris Sr. Elementary
-School</p></dd></dl>’"
-```
-
----
-
-## regexpr
-
-Let’s use the pattern
-'
"
-```
-
----
-
-## regexpr
-
-The previous pattern was too greedy and matched too much of the string. We need to use the ? metacharacter to make the regex “lazy”.
-
-```r
-> regexpr("
"
-```
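-
-As a rough illustration of greedy versus lazy matching, the sketch below uses a shortened, made-up record `x` modeled on the entries above (an assumption, not an actual line from `homicides.txt`). It writes `[Ff]` rather than `[F|f]`, since inside a character class `|` is a literal character and `[F|f]` would also accept it.
-
-```r
-# Hypothetical record fragment with two <dd> elements
-x <- "<dd>Found on January 1, 2007</dd><dd>Victim died at Shock Trauma</dd>"
-
-# Greedy: (.*) runs to the LAST </dd>, so the whole string is matched
-greedy <- regexpr("<dd>[Ff]ound(.*)</dd>", x)
-attr(greedy, "match.length") == nchar(x)    # TRUE
-
-# Lazy: (.*?) stops at the FIRST </dd>
-lazy <- regexpr("<dd>[Ff]ound(.*?)</dd>", x)
-start <- as.integer(lazy)
-len <- attr(lazy, "match.length")
-lazy_text <- substr(x, start, start + len - 1)
-lazy_text    # "<dd>Found on January 1, 2007</dd>"
-```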
-
----
-
-## regmatches
-
-One handy function is regmatches which extracts the matches in the strings for you without you having to use `substr`.
-
-```r
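-> r <- regexpr("<dd>[F|f]ound(.*?)</dd>", homicides[1:5])
-> regmatches(homicides[1:5], r)
-[1] "<dd>Found on January 1, 2007</dd>" "<dd>Found on January 2, 2007</dd>"
-[3] "<dd>Found on January 2, 2007</dd>" "<dd>Found on January 3, 2007</dd>"
-[5] "<dd>Found on January 5, 2007</dd>"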
-```
-
----
-
-## sub/gsub
-
-Sometimes we need to clean things up or modify strings by matching a pattern and replacing it with something else. For example, how can we extract the date from this string?
-
-```r
-> x <- substr(homicides[1], 177, 177 + 33 - 1)
-> x
-[1] "<dd>Found on January 1, 2007</dd>"
-```
-
-We want to strip out the stuff surrounding the “January 1, 2007” piece.
-
-```r
-> sub("<dd>[F|f]ound on |</dd>", "", x)
-[1] "January 1, 2007</dd>"
-> gsub("<dd>[F|f]ound on |</dd>", "", x)
-[1] "January 1, 2007"
-```
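-
-An alternative sketch (not from the original slides): capture the date in parentheses and keep only that group with the `\\1` backreference, so a single `sub` call is enough. The string `x` is retyped here so the snippet runs on its own.
-
-```r
-x <- "<dd>Found on January 1, 2007</dd>"
-
-# Match the whole element, then replace it with the parenthesized part only
-date_chr <- sub("<dd>[Ff]ound on (.*)</dd>", "\\1", x)
-date_chr    # "January 1, 2007"
-```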
-
----
-
-## sub/gsub
-
-sub/gsub can take vector arguments
-
-```r
-> r <- regexpr("<dd>[F|f]ound(.*?)</dd>", homicides[1:5])
-> m <- regmatches(homicides[1:5], r)
-> m
-[1] "<dd>Found on January 1, 2007</dd>" "<dd>Found on January 2, 2007</dd>"
-[3] "<dd>Found on January 2, 2007</dd>" "<dd>Found on January 3, 2007</dd>"
-[5] "<dd>Found on January 5, 2007</dd>"
-> d <- gsub("<dd>[F|f]ound on |</dd>", "", m)
-> d
-[1] "January 1, 2007" "January 2, 2007" "January 2, 2007" "January 3, 2007"
-[5] "January 5, 2007"
-> as.Date(d, "%B %d, %Y")
-[1] "2007-01-01" "2007-01-02" "2007-01-02" "2007-01-03" "2007-01-05"
-```
-
----
-
-## regexec
-
-The `regexec` function works like regexpr except it gives you the indices for parenthesized sub-expressions.
-
-```r
-> regexec("
" "January 2, 2007"
-```
-
----
-
-## regexec
-
-Let’s make a plot of monthly homicide counts
-
-```r
-> r <- regexec("<dd>[F|f]ound on (.*?)</dd>", homicides)
-> m <- regmatches(homicides, r)
-> dates <- sapply(m, function(x) x[2])
-> dates <- as.Date(dates, "%B %d, %Y")
-> hist(dates, "month", freq = TRUE)
-```
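-
-The same steps can be sketched on a few made-up records (hypothetical stand-ins for lines of `homicides.txt`). Note that `%B` parses month names according to the current locale, so the `as.Date` step assumes an English locale.
-
-```r
-records <- c("<dd>Found on January 1, 2007</dd>",
-             "<dd>Found on January 2, 2007</dd>",
-             "<dd>Found on March 3, 2010</dd>")
-
-# regexec returns, per string, the position of the full match
-# followed by the position of each parenthesized sub-expression
-r <- regexec("<dd>[Ff]ound on (.*?)</dd>", records)
-m <- regmatches(records, r)
-
-# Element 1 is the full match, element 2 is the captured date
-dates_chr <- sapply(m, function(x) x[2])
-dates_chr   # "January 1, 2007" "January 2, 2007" "March 3, 2010"
-
-# Convert to Date and count records per month
-dates <- as.Date(dates_chr, "%B %d, %Y")
-table(format(dates, "%Y-%m"))
-```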
-
----
-
-## regexec
-
-
-
----
-
-## Summary
-
-The primary R functions for dealing with regular expressions are
-- `grep`, `grepl`: Search for matches of a regular expression/pattern in a character vector
-- `regexpr`, `gregexpr`: Search a character vector for regular expression matches and return the indices where the match begins; useful in conjunction with `regmatches`
-- `sub`, `gsub`: Search a character vector for regular expression matches and replace that match with another string
-- `regexec`: Gives you indices of parenthesized sub-expressions.
\ No newline at end of file
diff --git a/02_RProgramming/NOTUSED/grep/index.html b/02_RProgramming/NOTUSED/grep/index.html
deleted file mode 100644
index 2d0a8f7b..00000000
--- a/02_RProgramming/NOTUSED/grep/index.html
+++ /dev/null
@@ -1,494 +0,0 @@
-
-
-
- Regular Expressions - grep
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
Regular Expressions - grep
-
Computing for Data Analysis
-
Roger Peng, Associate Professor Johns Hopkins Bloomberg School of Public Health
-
-
-
-
-
-
-
Regular Expression Functions
-
-
-
The primary R functions for dealing with regular expressions are
-
-
-
grep, grepl: Search for matches of a regular expression/pattern in a character vector; either return the indices into the character vector that match, the strings that happen to match, or a TRUE/FALSE vector indicating which elements match
-
regexpr, gregexpr: Search a character vector for regular expression matches and return the indices of the string where the match begins and the length of the match
-
sub, gsub: Search a character vector for regular expression matches and replace that match with another string
-
regexec: Easier to explain through demonstration.
-
-
-
-
-
-
-
-
-
grep
-
-
-
Here is an excerpt of the Baltimore City homicides dataset:
-
-
> homicides <- readLines("homicides.txt")
-> homicides[1]
-[1] "39.311024, -76.674227, iconHomicideShooting, ’p2’, ’<dl><dt>Leon
-Nelson</dt><dd class=\"address\">3400 Clifton Ave.<br />Baltimore, MD
-21216</dd><dd>black male, 17 years old</dd>
-<dd>Found on January 1, 2007</dd><dd>Victim died at Shock
-Trauma</dd><dd>Cause: shooting</dd></dl>’"
-
-> homicides[1000]
-[1] "39.33626300000, -76.55553990000, icon_homicide_shooting, ’p1200’,...
-
-
-
How can I find the records for all the victims of shootings (as opposed to other causes)?
> homicides[859]
-[1] "39.33743900000, -76.66316500000, icon_homicide_bluntforce,
-’p914’, ’<dl><dt><a href=\"http://essentials.baltimoresun.com/
-micro_sun/homicides/victim/914/steven-harris\">Steven Harris</a>
-</dt><dd class=\"address\">4200 Pimlico Road<br />Baltimore, MD 21215
-</dd><dd>Race: Black<br />Gender: male<br />Age: 38 years old</dd>
-<dd>Found on July 29, 2010</dd><dd>Victim died at Scene</dd>
-<dd>Cause: Blunt Force</dd><dd class=\"popup-note\"><p>Harris was
-found dead July 22 and ruled a shooting victim; an autopsy
-subsequently showed that he had not been shot,...</dd></dl>’"
-
-
-
-
-
-
-
-
-
grep
-
-
-
By default, grep returns the indices into the character vector where the regex pattern matches.
-
-
> grep("^New", state.name)
-[1] 29 30 31 32
-# Setting value = TRUE returns the actual elements of the character vector that match.
-> grep("^New", state.name, value = TRUE)
-[1] "New Hampshire" "New Jersey" "New Mexico" "New York"
-# grepl returns a logical vector indicating which element matches.
-> grepl("^New", state.name)
- [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
-[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
-[25] FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE
-[37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
-[49] FALSE FALSE
-
-
-
-
-
-
-
-
-
-
regexpr
-
-
-
Some limitations of grep
-
-
-
The grep function tells you which strings in a character vector match a certain pattern but it doesn’t tell you exactly where the match occurs or what the match is (for a more complicated regex).
-
The regexpr function gives you the index into each string where the match begins and the length of the match for that string.
-
regexpr only gives you the first match of the string (reading left to right). gregexpr will give you all of the matches in a given string.
-
-
-
-
-
-
-
-
-
regexpr
-
-
-
How can we find the date of the homicide?
-
-
> homicides[1]
-[1] "39.311024, -76.674227, iconHomicideShooting, ’p2’, ’<dl><dt>Leon
-Nelson</dt><dd class=\"address\">3400 Clifton Ave.<br />Baltimore,
-MD 21216</dd><dd>black male, 17 years old</dd>
-<dd>Found on January 1, 2007</dd><dd>Victim died at Shock
-Trauma</dd><dd>Cause: shooting</dd></dl>’"
-
-
-
Can we just ’grep’ on “Found”?
-
-
-
-
-
-
-
-
regexpr
-
-
-
The word ’found’ may be found elsewhere in the entry.
-
-
> homicides[954]
-[1] "39.30677400000, -76.59891100000, icon_homicide_shooting, ’p816’,
-’<dl><dd class=\"address\">1400 N Caroline St<br />Baltimore, MD 21213</dd>
-<dd>Race: Black<br />Gender: male<br />Age: 29 years old</dd>
-<dd>Found on March 3, 2010</dd><dd>Victim died at Scene</dd>
-<dd>Cause: Shooting</dd><dd class=\"popup-note\"><p>Wheeler\\’s body
-was found on the grounds of Dr. Bernard Harris Sr. Elementary
-School</p></dd></dl>’"
-
One handy function is regmatches which extracts the matches in the strings for you without you having to use substr.
-
-
> r <- regexpr("<dd>[F|f]ound(.*?)</dd>", homicides[1:5])
-> regmatches(homicides[1:5], r)
-[1] "<dd>Found on January 1, 2007</dd>" "<dd>Found on January 2, 2007</dd>"
-[3] "<dd>Found on January 2, 2007</dd>" "<dd>Found on January 3, 2007</dd>"
-[5] "<dd>Found on January 5, 2007</dd>"
-
-
-
-
-
-
-
-
-
sub/gsub
-
-
-
Sometimes we need to clean things up or modify strings by matching a pattern and replacing it with something else. For example, how can we extract the date from this string?
-
-
> x <- substr(homicides[1], 177, 177 + 33 - 1)
-> x
-[1] "<dd>Found on January 1, 2007</dd>"
-
-
-
We want to strip out the stuff surrounding the “January 1, 2007” piece.
The primary R functions for dealing with regular expressions are
-
-
-
grep, grepl: Search for matches of a regular expression/pattern in a character vector
-
regexpr, gregexpr: Search a character vector for regular expression matches and return the indices where the match begins; useful in conjunction with regmatches
-
sub, gsub: Search a character vector for regular expression matches and replace that match with another string
-
regexec: Gives you indices of parenthesized sub-expressions.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
\ No newline at end of file
diff --git a/02_RProgramming/NOTUSED/grep/index.md b/02_RProgramming/NOTUSED/grep/index.md
deleted file mode 100644
index a4da4e1f..00000000
--- a/02_RProgramming/NOTUSED/grep/index.md
+++ /dev/null
@@ -1,337 +0,0 @@
----
-title : Regular Expressions - grep
-subtitle : Computing for Data Analysis
-author : Roger Peng, Associate Professor
-job : Johns Hopkins Bloomberg School of Public Health
-logo : bloomberg_shield.png
-framework : io2012 # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js # {highlight.js, prettify, highlight}
-hitheme : tomorrow #
-url:
- lib: ../../libraries
- assets: ../../assets
-widgets : [mathjax] # {mathjax, quiz, bootstrap}
-mode : selfcontained # {standalone, draft}
----
-
-## Regular Expression Functions
-
-The primary R functions for dealing with regular expressions are
-- `grep`, `grepl`: Search for matches of a regular expression/pattern in a character vector; either return the indices into the character vector that match, the strings that happen to match, or a TRUE/FALSE vector indicating which elements match
-- `regexpr`, `gregexpr`: Search a character vector for regular expression matches and return the indices of the string where the match begins and the length of the match
-- `sub`, `gsub`: Search a character vector for regular expression matches and replace that match with another string
-- `regexec`: Easier to explain through demonstration.
-
----
-
-## grep
-
-Here is an excerpt of the Baltimore City homicides dataset:
-
-```r
-> homicides <- readLines("homicides.txt")
-> homicides[1]
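-[1] "39.311024, -76.674227, iconHomicideShooting, ’p2’, ’<dl><dt>Leon
-Nelson</dt><dd class=\"address\">3400 Clifton Ave.<br />Baltimore, MD
-21216</dd><dd>black male, 17 years old</dd>
-<dd>Found on January 1, 2007</dd><dd>Victim died at Shock
-Trauma</dd><dd>Cause: shooting</dd></dl>’"
-
-> homicides[1000]
-[1] "39.33626300000, -76.55553990000, icon_homicide_shooting, ’p1200’,...
-
-# How can I find the records for all the victims of shootings (as opposed to other causes)?
-> homicides[859]
-[1] "39.33743900000, -76.66316500000, icon_homicide_bluntforce,
-’p914’, ’<dl><dt><a href=\"http://essentials.baltimoresun.com/
-micro_sun/homicides/victim/914/steven-harris\">Steven Harris</a>
-</dt><dd class=\"address\">4200 Pimlico Road<br />Baltimore, MD 21215
-</dd><dd>Race: Black<br />Gender: male<br />Age: 38 years old</dd>
-<dd>Found on July 29, 2010</dd><dd>Victim died at Scene</dd>
-<dd>Cause: Blunt Force</dd><dd class=\"popup-note\"><p>Harris was
-found dead July 22 and ruled a shooting victim; an autopsy
-subsequently showed that he had not been shot,...</dd></dl>’"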
-```
-
----
-
-## grep
-
-By default, `grep` returns the indices into the character vector where the regex pattern matches.
-
-```r
-> grep("^New", state.name)
-[1] 29 30 31 32
-# Setting value = TRUE returns the actual elements of the character vector that match.
-> grep("^New", state.name, value = TRUE)
-[1] "New Hampshire" "New Jersey" "New Mexico" "New York"
-# grepl returns a logical vector indicating which element matches.
-> grepl("^New", state.name)
- [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
-[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
-[25] FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE
-[37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
-[49] FALSE FALSE
-
-```
-
----
-
-## regexpr
-
-Some limitations of `grep`
-- The `grep` function tells you which strings in a character vector match a certain pattern but it doesn’t tell you exactly where the match occurs or what the match is (for a more complicated regex).
-- The `regexpr` function gives you the index into each string where the match begins and the length of the match for that string.
-- `regexpr` only gives you the first match of the string (reading left to right). `gregexpr` will give you all of the matches in a given string.
-
----
-
-## regexpr
-
-How can we find the date of the homicide?
-
-```r
-> homicides[1]
-[1] "39.311024, -76.674227, iconHomicideShooting, ’p2’, ’<dl><dt>Leon
-Nelson</dt><dd class=\"address\">3400 Clifton Ave.<br />Baltimore,
-MD 21216</dd><dd>black male, 17 years old</dd>
-<dd>Found on January 1, 2007</dd><dd>Victim died at Shock
-Trauma</dd><dd>Cause: shooting</dd></dl>’"
-```
-
-Can we just ’grep’ on “Found”?
-
----
-
-## regexpr
-
-The word ’found’ may be found elsewhere in the entry.
-
-```r
-> homicides[954]
-[1] "39.30677400000, -76.59891100000, icon_homicide_shooting, ’p816’,
-’<dl><dd class=\"address\">1400 N Caroline St<br />Baltimore, MD 21213</dd>
-<dd>Race: Black<br />Gender: male<br />Age: 29 years old</dd>
-<dd>Found on March 3, 2010</dd><dd>Victim died at Scene</dd>
-<dd>Cause: Shooting</dd><dd class=\"popup-note\"><p>Wheeler\\’s body
-was found on the grounds of Dr. Bernard Harris Sr. Elementary
-School</p></dd></dl>’"
-```
-
----
-
-## regexpr
-
-Let’s use the pattern
-'
"
-```
-
----
-
-## regexpr
-
-The previous pattern was too greedy and matched too much of the string. We need to use the ? metacharacter to make the regex “lazy”.
-
-```r
-> regexpr("
"
-```
-
----
-
-## regmatches
-
-One handy function is regmatches which extracts the matches in the strings for you without you having to use `substr`.
-
-```r
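-> r <- regexpr("<dd>[F|f]ound(.*?)</dd>", homicides[1:5])
-> regmatches(homicides[1:5], r)
-[1] "<dd>Found on January 1, 2007</dd>" "<dd>Found on January 2, 2007</dd>"
-[3] "<dd>Found on January 2, 2007</dd>" "<dd>Found on January 3, 2007</dd>"
-[5] "<dd>Found on January 5, 2007</dd>"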
-```
-
----
-
-## sub/gsub
-
-Sometimes we need to clean things up or modify strings by matching a pattern and replacing it with something else. For example, how can we extract the date from this string?
-
-```r
-> x <- substr(homicides[1], 177, 177 + 33 - 1)
-> x
-[1] "<dd>Found on January 1, 2007</dd>"
-```
-
-We want to strip out the stuff surrounding the “January 1, 2007” piece.
-
-```r
-> sub("<dd>[F|f]ound on |</dd>", "", x)
-[1] "January 1, 2007</dd>"
-> gsub("<dd>[F|f]ound on |</dd>", "", x)
-[1] "January 1, 2007"
-```
-
----
-
-## sub/gsub
-
-sub/gsub can take vector arguments
-
-```r
-> r <- regexpr("<dd>[F|f]ound(.*?)</dd>", homicides[1:5])
-> m <- regmatches(homicides[1:5], r)
-> m
-[1] "<dd>Found on January 1, 2007</dd>" "<dd>Found on January 2, 2007</dd>"
-[3] "<dd>Found on January 2, 2007</dd>" "<dd>Found on January 3, 2007</dd>"
-[5] "<dd>Found on January 5, 2007</dd>"
-> d <- gsub("<dd>[F|f]ound on |</dd>", "", m)
-> d
-[1] "January 1, 2007" "January 2, 2007" "January 2, 2007" "January 3, 2007"
-[5] "January 5, 2007"
-> as.Date(d, "%B %d, %Y")
-[1] "2007-01-01" "2007-01-02" "2007-01-02" "2007-01-03" "2007-01-05"
-```
-
----
-
-## regexec
-
-The `regexec` function works like regexpr except it gives you the indices for parenthesized sub-expressions.
-
-```r
-> regexec("
" "January 2, 2007"
-```
-
----
-
-## regexec
-
-Let’s make a plot of monthly homicide counts
-
-```r
-> r <- regexec("<dd>[F|f]ound on (.*?)</dd>", homicides)
-> m <- regmatches(homicides, r)
-> dates <- sapply(m, function(x) x[2])
-> dates <- as.Date(dates, "%B %d, %Y")
-> hist(dates, "month", freq = TRUE)
-```
-
----
-
-## regexec
-
-
-
----
-
-## Summary
-
-The primary R functions for dealing with regular expressions are
-- `grep`, `grepl`: Search for matches of a regular expression/pattern in a character vector
-- `regexpr`, `gregexpr`: Search a character vector for regular expression matches and return the indices where the match begins; useful in conjunction with `regmatches`
-- `sub`, `gsub`: Search a character vector for regular expression matches and replace that match with another string
-- `regexec`: Gives you indices of parenthesized sub-expressions.
diff --git a/02_RProgramming/NOTUSED/regex/Regular Expressions.pdf b/02_RProgramming/NOTUSED/regex/Regular Expressions.pdf
deleted file mode 100644
index 3f9d25eb..00000000
Binary files a/02_RProgramming/NOTUSED/regex/Regular Expressions.pdf and /dev/null differ
diff --git a/02_RProgramming/NOTUSED/regex/index.Rmd b/02_RProgramming/NOTUSED/regex/index.Rmd
deleted file mode 100644
index f563b995..00000000
--- a/02_RProgramming/NOTUSED/regex/index.Rmd
+++ /dev/null
@@ -1,475 +0,0 @@
----
-title : Regular Expressions
-subtitle : Computing for Data Analysis
-author : Roger Peng, Associate Professor
-job : Johns Hopkins Bloomberg School of Public Health
-logo : bloomberg_shield.png
-framework : io2012 # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js # {highlight.js, prettify, highlight}
-hitheme : tomorrow #
-url:
- lib: ../../libraries
- assets: ../../assets
-widgets : [mathjax] # {mathjax, quiz, bootstrap}
-mode : selfcontained # {standalone, draft}
----
-
-## Regular expressions
-
-- Regular expressions can be thought of as a combination of literals and _metacharacters_
-- To draw an analogy with natural language, think of literal text forming the words of this language, and the metacharacters defining its grammar
-- Regular expressions have a rich set of metacharacters
-
----
-
-## Literals
-
-The simplest pattern consists only of literals. The literal “nuclear” would match the following lines:
-
-```markdown
-Ooh. I just learned that to keep myself alive after a
-nuclear blast! All I have to do is milk some rats
-then drink the milk. Aweosme. :}
-
-Laozi says nuclear weapons are mas macho
-
-Chaos in a country that has nuclear weapons -- not good.
-
-my nephew is trying to teach me nuclear physics, or
-possibly just trying to show me how smart he is
-so I’ll be proud of him [which I am].
-
-lol if you ever say "nuclear" people immediately think
-DEATH by radiation LOL
-```
-
----
-
-## Literals
-
-The literal “Obama” would match the following lines
-
-```markdown
-Politics r dum. Not 2 long ago Clinton was sayin Obama
-was crap n now she sez vote 4 him n unite? WTF?
-Screw em both + Mcain. Go Ron Paul!
-
-Clinton conceeds to Obama but will her followers listen??
-
-Are we sure Chelsea didn’t vote for Obama?
-
-thinking ... Michelle Obama is terrific!
-
-jetlag..no sleep...early mornig to starbux..Ms. Obama
-was moving
-```
-
----
-
-## Regular Expressions
-
-- The simplest pattern consists only of literals; a match occurs if the sequence of literals occurs anywhere in the text being tested
-
-- What if we only want the word “Obama”? or sentences that end in the word “Clinton”, or “clinton” or “clinto”?
-
----
-
-## Regular Expressions
-
-We need a way to express
-- whitespace word boundaries
-- sets of literals
-- the beginning and end of a line
-- alternatives (“war” or “peace”)
-
-Metacharacters to the rescue!
-
----
-
-## Metacharacters
-
-Some metacharacters represent positions in a line; ^ represents the start of a line
-
-```markdown
-^i think
-```
-
-will match the lines
-
-```markdown
-i think we all rule for participating
-i think i have been outed
-i think this will be quite fun actually
-i think i need to go to work
-i think i first saw zombo in 1999.
-```
-
----
-
-## Metacharacters
-
-$ represents the end of a line
-
-```markdown
-morning$
-```
-
-will match the lines
-
-```markdown
-well they had something this morning
-then had to catch a tram home in the morning
-dog obedience school in the morning
-and yes happy birthday i forgot to say it earlier this morning
-I walked in the rain this morning
-good morning
-```
-
----
-
-## Character Classes with []
-
-We can list a set of characters we will accept at a given point in the match
-
-```markdown
-[Bb][Uu][Ss][Hh]
-```
-
-will match the lines
-
-```markdown
-The democrats are playing, "Name the worst thing about Bush!"
-I smelled the desert creosote bush, brownies, BBQ chicken
-BBQ and bushwalking at Molonglo Gorge
-Bush TOLD you that North Korea is part of the Axis of Evil
-I’m listening to Bush - Hurricane (Album Version)
-```
-
----
-
-## Character Classes with []
-
-```markdown
-^[Ii] am
-```
-
-will match
-
-```markdown
-i am so angry at my boyfriend i can’t even bear to
-look at him
-
-i am boycotting the apple store
-
-I am twittering from iPhone
-
-I am a very vengeful person when you ruin my sweetheart.
-
-I am so over this. I need food. Mmmm bacon...
-```
-
----
-
-## Character Classes with []
-
-Similarly, you can specify a range of letters [a-z] or [a-zA-Z]; notice that the order of the ranges inside the brackets doesn’t matter
-
-```markdown
-^[0-9][a-zA-Z]
-```
-
-will match the lines
-
-```markdown
-7th inning stretch
-2nd half soon to begin. OSU did just win something
-3am - cant sleep - too hot still.. :(
-5ft 7 sent from heaven
-1st sign of starvagtion
-```
-
----
-
-## Character Classes with []
-
-When used at the beginning of a character class, the “^” is also a metacharacter and indicates matching characters NOT in the indicated class
-
-```markdown
-[^?.]$
-```
-
-will match the lines
-
-```markdown
-i like basketballs
-6 and 9
-dont worry... we all die anyway!
-Not in Baghdad
-helicopter under water? hmmm
-```
-
----
-
-## More Metacharacters
-
-“.” is used to refer to any character. So
-
-```markdown
-9.11
-```
-
-will match the lines
-
-```markdown
-its stupid the post 9-11 rules
-if any 1 of us did 9/11 we would have been caught in days.
-NetBios: scanning ip 203.169.114.66
-Front Door 9:11:46 AM
-Sings: 0118999881999119725...3 !
-```
-
----
-
-## More Metacharacters: |
-
-This does not mean “pipe” in the context of regular expressions; instead it translates to “or”; we can use it to combine two expressions, the subexpressions being called alternatives
-
-```markdown
-flood|fire
-```
-
-will match the lines
-
-```markdown
-is firewire like usb on none macs?
-the global flood makes sense within the context of the bible
-yeah ive had the fire on tonight
-... and the floods, hurricanes, killer heatwaves, rednecks, gun nuts, etc.
-
-```
-
----
-
-## More Metacharacters: |
-
-We can include any number of alternatives...
-
-```markdown
-flood|earthquake|hurricane|coldfire
-```
-
-will match the lines
-
-```markdown
-Not a whole lot of hurricanes in the Arctic.
-We do have earthquakes nearly every day somewhere in our State
-hurricanes swirl in the other direction
-coldfire is STRAIGHT!
-’cause we keep getting earthquakes
-```
-
----
-
-## More Metacharacters: |
-
-The alternatives can be real expressions and not just literals
-
-```markdown
-^[Gg]ood|[Bb]ad
-```
-
-will match the lines
-
-```markdown
-good to hear some good knews from someone here
-Good afternoon fellow american infidels!
-good on you-what do you drive?
-Katie... guess they had bad experiences...
-my middle name is trouble, Miss Bad News
-```
-
----
-
-## More Metacharacters: ( and )
-
-Subexpressions are often contained in parentheses to constrain the alternatives
-
-```markdown
-^([Gg]ood|[Bb]ad)
-```
-
-will match the lines
-
-```markdown
-bad habbit
-bad coordination today
-good, becuase there is nothing worse than a man in kinky underwear
-Badcop, its because people want to use drugs
-Good Monday Holiday
-Good riddance to Limey
-```
-
----
-
-## More Metacharacters: ?
-
-The question mark indicates that the indicated expression is optional
-
-```markdown
-[Gg]eorge( [Ww]\.)? [Bb]ush
-```
-
-will match the lines
-
-```markdown
-i bet i can spell better than you and george bush combined
-BBC reported that President George W. Bush claimed God told him to invade I
-a bird in the hand is worth two george bushes
-```
-
----
-
-## One thing to note...
-
-In the following
-
-```markdown
-[Gg]eorge( [Ww]\.)? [Bb]ush
-```
-
-we wanted to match a “.” as a literal period; to do that, we had to “escape” the metacharacter, preceding it with a backslash. In general, we have to do this for any metacharacter we want to include in our match
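-
-In R the pattern is written as a string literal, so the backslash itself has to be doubled (`\\.`). A small sketch with made-up input lines (the vector `x` is an assumption for illustration):
-
-```r
-x <- c("George W. Bush", "george bush", "George Wx Bush")
-
-# Escaped: "\\." matches only a literal period
-escaped <- grepl("[Gg]eorge( [Ww]\\.)? [Bb]ush", x)
-escaped      # TRUE TRUE FALSE
-
-# Unescaped: "." matches any character, so "Wx" is accepted too
-unescaped <- grepl("[Gg]eorge( [Ww].)? [Bb]ush", x)
-unescaped    # TRUE TRUE TRUE
-```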
-
----
-
-## More metacharacters: * and +
-
-The * and + signs are metacharacters used to indicate repetition; * means “any number, including none, of the item” and + means “at least one of the item”
-
-```markdown
-(.*)
-```
-
-will match the lines
-
-```markdown
-anyone wanna chat? (24, m, germany)
-hello, 20.m here... ( east area + drives + webcam )
-(he means older men)
-()
-```
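-
-Note that unescaped parentheses only group; they do not match literal “(” and “)”. A hedged sketch with made-up lines showing the difference (in an R string the escape is written `\\(`):
-
-```r
-x <- c("no brackets here", "(he means older men)", "()")
-
-# (.*) is a group around "any number of any character";
-# it can match the empty string, so every line matches
-grouped <- grepl("(.*)", x)
-grouped    # TRUE TRUE TRUE
-
-# \\( and \\) match literal parentheses
-literal <- grepl("\\(.*\\)", x)
-literal    # FALSE TRUE TRUE
-```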
-
----
-
-## More metacharacters: * and +
-
-The * and + signs are metacharacters used to indicate repetition; * means “any number, including none, of the item” and + means “at least one of the item”
-
-```markdown
-[0-9]+ (.*)[0-9]+
-```
-
-will match the lines
-
-```markdown
-working as MP here 720 MP battallion, 42nd birgade
-so say 2 or 3 years at colleage and 4 at uni makes us 23 when and if we fin
-it went down on several occasions for like, 3 or 4 *days*
-Mmmm its time 4 me 2 go 2 bed
-```
-
----
-
-## More metacharacters: { and }
-
-{ and } are referred to as interval quantifiers; they let us specify the minimum and maximum number of matches of an expression
-
-```markdown
-[Bb]ush( +[^ ]+ +){1,5} debate
-```
-
-will match the lines
-
-```markdown
-Bush has historically won all major debates he’s done.
-in my view, Bush doesn’t need these debates..
-bush doesn’t need the debates? maybe you are right
-That’s what Bush supporters are doing about the debate.
-Felix, I don’t disagree that Bush was poorly prepared for the debate.
-indeed, but still, Bush should have taken the debate more seriously.
-Keep repeating that Bush smirked and scowled during the debate
-```
-
----
-
-## More metacharacters: { and }
-
-- {m,n} means at least m but not more than n matches
-- {m} means exactly m matches
-- {m,} means at least m matches
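-
-A minimal sketch of the three interval forms (`{m,n}`, `{m}` and `{m,}`) on made-up strings; the anchors `^` and `$` force the whole string to be counted:
-
-```r
-x <- c("a", "aa", "aaa", "aaaa")
-
-between <- grepl("^a{2,3}$", x)   # at least 2, at most 3
-between    # FALSE TRUE TRUE FALSE
-
-exactly <- grepl("^a{2}$", x)     # exactly 2
-exactly    # FALSE TRUE FALSE FALSE
-
-atleast <- grepl("^a{2,}$", x)    # 2 or more
-atleast    # FALSE TRUE TRUE TRUE
-```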
-
----
-
-## More metacharacters: ( and ) revisited
-
-- In most implementations of regular expressions, the parentheses not only limit the scope of alternatives divided by a “|”, but also can be used to “remember” text matched by the subexpression enclosed
-- We refer to the matched text with \1, \2, etc.
-
----
-
-## More metacharacters: ( and ) revisited
-
-So the expression
-
-```markdown
- +([a-zA-Z]+) +\1 +
-```
-
-will match the lines
-
-```markdown
-time for bed, night night twitter!
-blah blah blah blah
-my tattoo is so so itchy today
-i was standing all all alone against the world outside...
-hi anybody anybody at home
-estudiando css css css css.... que desastritooooo
-```
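-
-In R the backreference is written `\\1` inside the pattern string. A small sketch with two made-up lines (the second has no repeated word):
-
-```r
-x <- c("time for bed, night night twitter!",
-       "my tattoo is so itchy today")
-
-# space(s), a word, space(s), the SAME word again, space(s)
-repeated <- grepl(" +([a-zA-Z]+) +\\1 +", x)
-repeated    # TRUE FALSE
-```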
-
----
-
-## More metacharacters: ( and ) revisited
-
-The * is “greedy” so it always matches the _longest_ possible string that satisfies the regular expression. So
-
-```markdown
-^s(.*)s
-```
-
-matches
-
-```markdown
-sitting at starbucks
-setting up mysql and rails
-studying stuff for the exams
-spaghetti with marshmallows
-stop fighting with crackers
-sore shoulders, stupid ergonomics
-```
-
----
-
-## More metacharacters: ( and ) revisited
-
-The greediness of * can be turned off with the ?, as in
-
-```markdown
-^s(.*?)s$
-```
-
----
-
-## Summary
-
-- Regular expressions are used in many different languages; not unique to R.
-- Regular expressions are composed of literals and metacharacters that represent sets or classes of characters/words
-- Text processing via regular expressions is a very powerful way to extract data from “unfriendly” sources (not all data comes as a CSV file)
-(Thanks to Mark Hansen for some material in this lecture.)
\ No newline at end of file
diff --git a/02_RProgramming/NOTUSED/regex/index.html b/02_RProgramming/NOTUSED/regex/index.html
deleted file mode 100644
index b905e3a7..00000000
--- a/02_RProgramming/NOTUSED/regex/index.html
+++ /dev/null
@@ -1,649 +0,0 @@
-
-
-
- Regular Expressions
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
Regular Expressions
-
Computing for Data Analysis
-
Roger Peng, Associate Professor Johns Hopkins Bloomberg School of Public Health
-
-
-
-
-
-
-
Regular expressions
-
-
-
-
Regular expressions can be thought of as a combination of literals and metacharacters
-
To draw an analogy with natural language, think of literal text forming the words of this language, and the metacharacters defining its grammar
-
Regular expressions have a rich set of metacharacters
-
-
-
-
-
-
-
-
-
Literals
-
-
-
The simplest pattern consists only of literals. The literal “nuclear” would match the following lines:
-
-
Ooh. I just learned that to keep myself alive after a
-nuclear blast! All I have to do is milk some rats
-then drink the milk. Aweosme. :}
-
-Laozi says nuclear weapons are mas macho
-
-Chaos in a country that has nuclear weapons -- not good.
-
-my nephew is trying to teach me nuclear physics, or
-possibly just trying to show me how smart he is
-so I’ll be proud of him [which I am].
-
-lol if you ever say "nuclear" people immediately think
-DEATH by radiation LOL
-
-
-
-
-
-
-
-
-
Literals
-
-
-
The literal “Obama” would match the following lines
-
-
Politics r dum. Not 2 long ago Clinton was sayin Obama
-was crap n now she sez vote 4 him n unite? WTF?
-Screw em both + Mcain. Go Ron Paul!
-
-Clinton conceeds to Obama but will her followers listen??
-
-Are we sure Chelsea didn’t vote for Obama?
-
-thinking ... Michelle Obama is terrific!
-
-jetlag..no sleep...early mornig to starbux..Ms. Obama
-was moving
-
-
-
-
-
-
-
-
-
Regular Expressions
-
-
-
-
Simplest pattern consists only of literals; a match occurs if the sequence of literals occurs anywhere in the text being tested
-
What if we only want the word “Obama”? or sentences that end in the word “Clinton”, or “clinton” or “clinto”?
-
-
-
-
-
-
-
-
-
Regular Expressions
-
-
-
We need a way to express
-
-
-
whitespace word boundaries
-
sets of literals
-
the beginning and end of a line
-
alternatives (“war” or “peace”)
-Metacharacters to the rescue!
-
-
-
-
-
-
-
-
-
Metacharacters
-
-
-
Some metacharacters represent the start of a line
-
-
^i think
-
-
-
will match the lines
-
-
i think we all rule for participating
-i think i have been outed
-i think this will be quite fun actually
-i think i need to go to work
-i think i first saw zombo in 1999.
-
-
-
-
-
-
-
-
-
Metacharacters
-
-
-
$ represents the end of a line
-
-
morning$
-
-
-
will match the lines
-
-
well they had something this morning
-then had to catch a tram home in the morning
-dog obedience school in the morning
-and yes happy birthday i forgot to say it earlier this morning
-I walked in the rain this morning
-good morning
-
-
-
-
-
-
-
-
-
Character Classes with []
-
-
-
We can list a set of characters we will accept at a given point in the match
-
-
[Bb][Uu][Ss][Hh]
-
-
-
will match the lines
-
-
The democrats are playing, "Name the worst thing about Bush!"
-I smelled the desert creosote bush, brownies, BBQ chicken
-BBQ and bushwalking at Molonglo Gorge
-Bush TOLD you that North Korea is part of the Axis of Evil
-I’m listening to Bush - Hurricane (Album Version)
-
-
-
-
-
-
-
-
-
Character Classes with []
-
-
-
^[Ii] am
-
-
-
will match
-
-
i am so angry at my boyfriend i can’t even bear to
-look at him
-
-i am boycotting the apple store
-
-I am twittering from iPhone
-
-I am a very vengeful person when you ruin my sweetheart.
-
-I am so over this. I need food. Mmmm bacon...
-
-
-
-
-
-
-
-
-
Character Classes with []
-
-
-
Similarly, you can specify a range of letters [a-z] or [a-zA-Z]; notice that the order doesn’t matter
-
-
^[0-9][a-zA-Z]
-
-
-
will match the lines
-
-
7th inning stretch
-2nd half soon to begin. OSU did just win something
-3am - cant sleep - too hot still.. :(
-5ft 7 sent from heaven
-1st sign of starvagtion
-
-
-
-
-
-
-
-
-
Character Classes with []
-
-
-
When used at the beginning of a character class, the “^” is also a metacharacter and indicates matching characters NOT in the indicated class
-
-
[^?.]$
-
-
-
will match the lines
-
-
i like basketballs
-6 and 9
-dont worry... we all die anyway!
-Not in Baghdad
-helicopter under water? hmmm
-
-
-
-
-
-
-
-
-
More Metacharacters
-
-
-
“.” is used to refer to any character. So
-
-
9.11
-
-
-
will match the lines
-
-
its stupid the post 9-11 rules
-if any 1 of us did 9/11 we would have been caught in days.
-NetBios: scanning ip 203.169.114.66
-Front Door 9:11:46 AM
-Sings: 0118999881999119725...3 !
-
-
-
-
-
-
-
-
-
More Metacharacters: |
-
-
-
This does not mean “pipe” in the context of regular expressions; instead it translates to “or”; we can use it to combine two expressions, the subexpressions being called alternatives
-
-
flood|fire
-
-
-
will match the lines
-
-
is firewire like usb on none macs?
-the global flood makes sense within the context of the bible
-yeah ive had the fire on tonight
-... and the floods, hurricanes, killer heatwaves, rednecks, gun nuts, etc.
-
-
-
-
-
-
-
-
-
-
More Metacharacters: |
-
-
-
We can include any number of alternatives...
-
-
flood|earthquake|hurricane|coldfire
-
-
-
will match the lines
-
-
Not a whole lot of hurricanes in the Arctic.
-We do have earthquakes nearly every day somewhere in our State
-hurricanes swirl in the other direction
-coldfire is STRAIGHT!
-’cause we keep getting earthquakes
-
-
-
-
-
-
-
-
-
More Metacharacters: |
-
-
-
The alternatives can be real expressions and not just literals
-
-
^[Gg]ood|[Bb]ad
-
-
-
will match the lines
-
-
good to hear some good knews from someone here
-Good afternoon fellow american infidels!
-good on you-what do you drive?
-Katie... guess they had bad experiences...
-my middle name is trouble, Miss Bad News
-
-
-
-
-
-
-
-
-
More Metacharacters: ( and )
-
-
-
Subexpressions are often contained in parentheses to constrain the alternatives
-
-
^([Gg]ood|[Bb]ad)
-
-
-
will match the lines
-
-
bad habbit
-bad coordination today
-good, becuase there is nothing worse than a man in kinky underwear
-Badcop, its because people want to use drugs
-Good Monday Holiday
-Good riddance to Limey
-
-
-
-
-
-
-
-
-
More Metacharacters: ?
-
-
-
The question mark indicates that the indicated expression is optional
-
-
[Gg]eorge( [Ww]\.)? [Bb]ush
-
-
-
will match the lines
-
-
i bet i can spell better than you and george bush combined
-BBC reported that President George W. Bush claimed God told him to invade I
-a bird in the hand is worth two george bushes
-
-
-
-
-
-
-
-
-
One thing to note...
-
-
-
In the following
-
-
[Gg]eorge( [Ww]\.)? [Bb]ush
-
-
-
we wanted to match a “.” as a literal period; to do that, we had to “escape” the metacharacter, preceding it with a backslash. In general, we have to do this for any metacharacter we want to match literally
-
-
-
-
-
-
-
-
More metacharacters: * and +
-
-
-
The * and + signs are metacharacters used to indicate repetition; * means “any number, including none, of the item” and + means “at least one of the item”
-
-
\(.*\)
-
-
-
will match the lines
-
-
anyone wanna chat? (24, m, germany)
-hello, 20.m here... ( east area + drives + webcam )
-(he means older men)
-()
-
-
-
-
-
-
-
-
-
More metacharacters: * and +
-
-
-
The * and + signs are metacharacters used to indicate repetition; * means “any number, including none, of the item” and + means “at least one of the item”
-
-
[0-9]+ (.*)[0-9]+
-
-
-
will match the lines
-
-
working as MP here 720 MP battallion, 42nd birgade
-so say 2 or 3 years at colleage and 4 at uni makes us 23 when and if we fin
-it went down on several occasions for like, 3 or 4 *days*
-Mmmm its time 4 me 2 go 2 bed
-
-
-
-
-
-
-
-
-
More metacharacters: { and }
-
-
-
{ and } are referred to as interval quantifiers; they let us specify the minimum and maximum number of matches of an expression
-
-
[Bb]ush( +[^ ]+){1,5} +debate
-
-
-
will match the lines
-
-
Bush has historically won all major debates he’s done.
-in my view, Bush doesn’t need these debates..
-bush doesn’t need the debates? maybe you are right
-That’s what Bush supporters are doing about the debate.
-Felix, I don’t disagree that Bush was poorly prepared for the debate.
-indeed, but still, Bush should have taken the debate more seriously.
-Keep repeating that Bush smirked and scowled during the debate
-
-
-
-
-
-
-
-
-
More metacharacters: { and }
-
-
-
-
{m,n} means at least m but not more than n matches
-
{m} means exactly m matches
-
{m,} means at least m matches
-
-
-
-
-
-
-
-
-
More metacharacters: ( and ) revisited
-
-
-
-
In most implementations of regular expressions, the parentheses not only limit the scope of alternatives divided by a “|”, but also can be used to “remember” text matched by the subexpression enclosed
-
We refer to the matched text with \1, \2, etc.
-
-
-
-
-
-
-
-
-
More metacharacters: ( and ) revisited
-
-
-
So the expression
-
-
 +([a-zA-Z]+) +\1 +
-
-
-
will match the lines
-
-
time for bed, night night twitter!
-blah blah blah blah
-my tattoo is so so itchy today
-i was standing all all alone against the world outside...
-hi anybody anybody at home
-estudiando css css css css.... que desastritooooo
-
-
-
-
-
-
-
-
-
More metacharacters: ( and ) revisited
-
-
-
The * is “greedy” so it always matches the longest possible string that satisfies the regular expression. So
-
-
^s(.*)s
-
-
-
matches
-
-
sitting at starbucks
-setting up mysql and rails
-studying stuff for the exams
-spaghetti with marshmallows
-stop fighting with crackers
-sore shoulders, stupid ergonomics
-
-
-
-
-
-
-
-
-
More metacharacters: ( and ) revisited
-
-
-
The greediness of * can be turned off with the ?, as in
-
-
^s(.*?)s$
-
-
-
-
-
-
-
-
-
Summary
-
-
-
-
Regular expressions are used in many different languages; not unique to R.
-
Regular expressions are composed of literals and metacharacters that represent sets or classes of characters/words
-
Text processing via regular expressions is a very powerful way to extract data from “unfriendly” sources (not all data comes as a CSV file)
-(Thanks to Mark Hansen for some material in this lecture.)
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
\ No newline at end of file
diff --git a/02_RProgramming/NOTUSED/regex/index.md b/02_RProgramming/NOTUSED/regex/index.md
deleted file mode 100644
index 8da8f474..00000000
--- a/02_RProgramming/NOTUSED/regex/index.md
+++ /dev/null
@@ -1,475 +0,0 @@
----
-title : Regular Expressions
-subtitle : Computing for Data Analysis
-author : Roger Peng, Associate Professor
-job : Johns Hopkins Bloomberg School of Public Health
-logo : bloomberg_shield.png
-framework : io2012 # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js # {highlight.js, prettify, highlight}
-hitheme : tomorrow #
-url:
- lib: ../../libraries
- assets: ../../assets
-widgets : [mathjax] # {mathjax, quiz, bootstrap}
-mode : selfcontained # {standalone, draft}
----
-
-## Regular expressions
-
-- Regular expressions can be thought of as a combination of literals and _metacharacters_
-- To draw an analogy with natural language, think of literal text forming the words of this language, and the metacharacters defining its grammar
-- Regular expressions have a rich set of metacharacters
-
----
-
-## Literals
-
-Simplest pattern consists only of literals. The literal “nuclear” would match to the following lines:
-
-```markdown
-Ooh. I just learned that to keep myself alive after a
-nuclear blast! All I have to do is milk some rats
-then drink the milk. Aweosme. :}
-
-Laozi says nuclear weapons are mas macho
-
-Chaos in a country that has nuclear weapons -- not good.
-
-my nephew is trying to teach me nuclear physics, or
-possibly just trying to show me how smart he is
-so I’ll be proud of him [which I am].
-
-lol if you ever say "nuclear" people immediately think
-DEATH by radiation LOL
-```
-
----
-
-## Literals
-
-The literal “Obama” would match to the following lines
-
-```markdown
-Politics r dum. Not 2 long ago Clinton was sayin Obama
-was crap n now she sez vote 4 him n unite? WTF?
-Screw em both + Mcain. Go Ron Paul!
-
-Clinton conceeds to Obama but will her followers listen??
-
-Are we sure Chelsea didn’t vote for Obama?
-
-thinking ... Michelle Obama is terrific!
-
-jetlag..no sleep...early mornig to starbux..Ms. Obama
-was moving
-```
-
----
-
-## Regular Expressions
-
-- Simplest pattern consists only of literals; a match occurs if the sequence of literals occurs anywhere in the text being tested
-
-- What if we only want the word “Obama”? or sentences that end in the word “Clinton”, or “clinton” or “clinto”?
-
----
-
-## Regular Expressions
-
-We need a way to express
-- whitespace word boundaries
-- sets of literals
-- the beginning and end of a line
-- alternatives (“war” or “peace”)
-
-Metacharacters to the rescue!
-
----
-
-## Metacharacters
-
-Some metacharacters represent the start of a line
-
-```markdown
-^i think
-```
-
-will match the lines
-
-```markdown
-i think we all rule for participating
-i think i have been outed
-i think this will be quite fun actually
-i think i need to go to work
-i think i first saw zombo in 1999.
-```
-
----
-
-## Metacharacters
-
-$ represents the end of a line
-
-```markdown
-morning$
-```
-
-will match the lines
-
-```markdown
-well they had something this morning
-then had to catch a tram home in the morning
-dog obedience school in the morning
-and yes happy birthday i forgot to say it earlier this morning
-I walked in the rain this morning
-good morning
-```
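The two anchors can be tried out in R with `grepl`, which returns TRUE for each element that matches. This is a minimal sketch: the `lines` vector is hypothetical sample text loosely modelled on the examples above, not the lecture's dataset.

```r
# Hypothetical sample lines (not the original lecture data)
lines <- c("i think we all rule for participating",
           "well, i think so",
           "good morning",
           "morning has broken")

# ^ anchors the pattern to the start of the string
starts_with_i_think <- grepl("^i think", lines)

# $ anchors the pattern to the end of the string
ends_with_morning <- grepl("morning$", lines)

lines[starts_with_i_think]
lines[ends_with_morning]
```

Without the anchors, "i think" would also match the second line and "morning" would also match the fourth.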
-
----
-
-## Character Classes with []
-
-We can list a set of characters we will accept at a given point in the match
-
-```markdown
-[Bb][Uu][Ss][Hh]
-```
-
-will match the lines
-
-```markdown
-The democrats are playing, "Name the worst thing about Bush!"
-I smelled the desert creosote bush, brownies, BBQ chicken
-BBQ and bushwalking at Molonglo Gorge
-Bush TOLD you that North Korea is part of the Axis of Evil
-I’m listening to Bush - Hurricane (Album Version)
-```
-
----
-
-## Character Classes with []
-
-```markdown
-^[Ii] am
-```
-
-will match
-
-```markdown
-i am so angry at my boyfriend i can’t even bear to
-look at him
-
-i am boycotting the apple store
-
-I am twittering from iPhone
-
-I am a very vengeful person when you ruin my sweetheart.
-
-I am so over this. I need food. Mmmm bacon...
-```
-
----
-
-## Character Classes with []
-
-Similarly, you can specify a range of letters [a-z] or [a-zA-Z]; notice that the order doesn’t matter
-
-```markdown
-^[0-9][a-zA-Z]
-```
-
-will match the lines
-
-```markdown
-7th inning stretch
-2nd half soon to begin. OSU did just win something
-3am - cant sleep - too hot still.. :(
-5ft 7 sent from heaven
-1st sign of starvagtion
-```
-
----
-
-## Character Classes with []
-
-When used at the beginning of a character class, the “^” is also a metacharacter and indicates matching characters NOT in the indicated class
-
-```markdown
-[^?.]$
-```
-
-will match the lines
-
-```markdown
-i like basketballs
-6 and 9
-dont worry... we all die anyway!
-Not in Baghdad
-helicopter under water? hmmm
-```
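A short R sketch of the three character-class patterns from these slides, assuming a small made-up vector `x` (the strings are illustrative, not the lecture data):

```r
# Hypothetical sample strings
x <- c("7th inning stretch",
       "Bush TOLD you",
       "creosote bush",
       "i like basketballs",
       "helicopter under water?",
       "Not in Baghdad.")

# Each bracket accepts either case of one letter
has_bush <- grepl("[Bb][Uu][Ss][Hh]", x)

# A digit at the start of the string, followed by a letter
digit_then_letter <- grepl("^[0-9][a-zA-Z]", x)

# Last character is anything except "?" or "."
# (inside brackets, "?" and "." are ordinary characters)
no_final_punct <- grepl("[^?.]$", x)

x[no_final_punct]
```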
-
----
-
-## More Metacharacters
-
-“.” is used to refer to any character. So
-
-```markdown
-9.11
-```
-
-will match the lines
-
-```markdown
-its stupid the post 9-11 rules
-if any 1 of us did 9/11 we would have been caught in days.
-NetBios: scanning ip 203.169.114.66
-Front Door 9:11:46 AM
-Sings: 0118999881999119725...3 !
-```
-
----
-
-## More Metacharacters: |
-
-This does not mean “pipe” in the context of regular expressions; instead it translates to “or”; we can use it to combine two expressions, the subexpressions being called alternatives
-
-```markdown
-flood|fire
-```
-
-will match the lines
-
-```markdown
-is firewire like usb on none macs?
-the global flood makes sense within the context of the bible
-yeah ive had the fire on tonight
-... and the floods, hurricanes, killer heatwaves, rednecks, gun nuts, etc.
-
-```
-
----
-
-## More Metacharacters: |
-
-We can include any number of alternatives...
-
-```markdown
-flood|earthquake|hurricane|coldfire
-```
-
-will match the lines
-
-```markdown
-Not a whole lot of hurricanes in the Arctic.
-We do have earthquakes nearly every day somewhere in our State
-hurricanes swirl in the other direction
-coldfire is STRAIGHT!
-’cause we keep getting earthquakes
-```
-
----
-
-## More Metacharacters: |
-
-The alternatives can be real expressions and not just literals
-
-```markdown
-^[Gg]ood|[Bb]ad
-```
-
-will match the lines
-
-```markdown
-good to hear some good knews from someone here
-Good afternoon fellow american infidels!
-good on you-what do you drive?
-Katie... guess they had bad experiences...
-my middle name is trouble, Miss Bad News
-```
-
----
-
-## More Metacharacters: ( and )
-
-Subexpressions are often contained in parentheses to constrain the alternatives
-
-```markdown
-^([Gg]ood|[Bb]ad)
-```
-
-will match the lines
-
-```markdown
-bad habbit
-bad coordination today
-good, becuase there is nothing worse than a man in kinky underwear
-Badcop, its because people want to use drugs
-Good Monday Holiday
-Good riddance to Limey
-```
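The difference between the last two patterns is one of precedence: without parentheses, `^` binds only to the first alternative. A minimal R sketch, using a hypothetical vector `x`:

```r
# Hypothetical sample strings
x <- c("good on you",
       "guess they had bad experiences",
       "bad habbit",
       "Miss Bad News")

# Reads as: (^[Gg]ood) OR ([Bb]ad anywhere in the string)
unanchored_alt <- grepl("^[Gg]ood|[Bb]ad", x)

# Parentheses make ^ apply to both alternatives
anchored_alt <- grepl("^([Gg]ood|[Bb]ad)", x)

x[unanchored_alt]
x[anchored_alt]
```

Only the parenthesised form rejects the lines where "bad" or "Bad" appears in the middle.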
-
----
-
-## More Metacharacters: ?
-
-The question mark indicates that the indicated expression is optional
-
-```markdown
-[Gg]eorge( [Ww]\.)? [Bb]ush
-```
-
-will match the lines
-
-```markdown
-i bet i can spell better than you and george bush combined
-BBC reported that President George W. Bush claimed God told him to invade I
-a bird in the hand is worth two george bushes
-```
-
----
-
-## One thing to note...
-
-In the following
-
-```markdown
-[Gg]eorge( [Ww]\.)? [Bb]ush
-```
-
-we wanted to match a “.” as a literal period; to do that, we had to “escape” the metacharacter, preceding it with a backslash. In general, we have to do this for any metacharacter we want to match literally
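When the pattern is written as an R string literal, the backslash itself has to be doubled, because R processes string escapes before the regex engine sees the pattern. A minimal sketch with a hypothetical vector `x`:

```r
# Hypothetical sample strings
x <- c("george bush",
       "George W. Bush",
       "George Wx Bush",
       "george bushes")

# "\\." in R source is the two-character regex \. (a literal period)
escaped <- grepl("[Gg]eorge( [Ww]\\.)? [Bb]ush", x)

# Unescaped, "." matches any character, so "Wx" is accepted too
unescaped <- grepl("[Gg]eorge( [Ww].)? [Bb]ush", x)

x[escaped]
x[unescaped]
```

For patterns with no metacharacters to interpret at all, `fixed = TRUE` is an alternative that treats the whole pattern as literal text.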
-
----
-
-## More metacharacters: * and +
-
-The * and + signs are metacharacters used to indicate repetition; * means “any number, including none, of the item” and + means “at least one of the item”
-
-```markdown
-\(.*\)
-```
-
-will match the lines
-
-```markdown
-anyone wanna chat? (24, m, germany)
-hello, 20.m here... ( east area + drives + webcam )
-(he means older men)
-()
-```
-
----
-
-## More metacharacters: * and +
-
-The * and + signs are metacharacters used to indicate repetition; * means “any number, including none, of the item” and + means “at least one of the item”
-
-```markdown
-[0-9]+ (.*)[0-9]+
-```
-
-will match the lines
-
-```markdown
-working as MP here 720 MP battallion, 42nd birgade
-so say 2 or 3 years at colleage and 4 at uni makes us 23 when and if we fin
-it went down on several occasions for like, 3 or 4 *days*
-Mmmm its time 4 me 2 go 2 bed
-```
-
----
-
-## More metacharacters: { and }
-
-{ and } are referred to as interval quantifiers; they let us specify the minimum and maximum number of matches of an expression
-
-```markdown
-[Bb]ush( +[^ ]+){1,5} +debate
-```
-
-will match the lines
-
-```markdown
-Bush has historically won all major debates he’s done.
-in my view, Bush doesn’t need these debates..
-bush doesn’t need the debates? maybe you are right
-That’s what Bush supporters are doing about the debate.
-Felix, I don’t disagree that Bush was poorly prepared for the debate.
-indeed, but still, Bush should have taken the debate more seriously.
-Keep repeating that Bush smirked and scowled during the debate
-```
-
----
-
-## More metacharacters: { and }
-
-- {m,n} means at least m but not more than n matches
-- {m} means exactly m matches
-- {m,} means at least m matches
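
The three interval forms can be checked on a toy input in R. This sketch assumes a made-up vector of repeated "a" characters and anchors each pattern with `^` and `$` so the count applies to the whole string:

```r
# Hypothetical toy input: one to four repetitions of "a"
x <- c("a", "aa", "aaa", "aaaa")

between_2_and_3 <- grepl("^a{2,3}$", x)  # at least 2, at most 3
exactly_2       <- grepl("^a{2}$", x)    # exactly 2
at_least_2      <- grepl("^a{2,}$", x)   # 2 or more

x[between_2_and_3]
x[exactly_2]
x[at_least_2]
```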
-
----
-
-## More metacharacters: ( and ) revisited
-
-- In most implementations of regular expressions, the parentheses not only limit the scope of alternatives divided by a “|”, but also can be used to “remember” text matched by the subexpression enclosed
-- We refer to the matched text with \1, \2, etc.
-
----
-
-## More metacharacters: ( and ) revisited
-
-So the expression
-
-```markdown
- +([a-zA-Z]+) +\1 +
-```
-
-will match the lines
-
-```markdown
-time for bed, night night twitter!
-blah blah blah blah
-my tattoo is so so itchy today
-i was standing all all alone against the world outside...
-hi anybody anybody at home
-estudiando css css css css.... que desastritooooo
-```
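In R, the backreference is written `\\1` inside a string literal, and it can be used both in the pattern and in the replacement text of `sub`. A minimal sketch follows; the vector `x` is hypothetical, and the pattern is written with a leading space so that the first `+` has something to repeat.

```r
# Hypothetical sample strings
x <- c("time for bed, night night twitter!",
       "my tattoo is so so itchy today",
       "this is not repeated")

# A word, one or more spaces, then the same word again
pattern <- " +([a-zA-Z]+) +\\1 +"

has_repeat <- grepl(pattern, x)

# Replace the first doubled word with a single copy of it
collapsed <- sub(pattern, " \\1 ", x)

x[has_repeat]
collapsed
```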
-
----
-
-## More metacharacters: ( and ) revisited
-
-The * is “greedy” so it always matches the _longest_ possible string that satisfies the regular expression. So
-
-```markdown
-^s(.*)s
-```
-
-matches
-
-```markdown
-sitting at starbucks
-setting up mysql and rails
-studying stuff for the exams
-spaghetti with marshmallows
-stop fighting with crackers
-sore shoulders, stupid ergonomics
-```
-
----
-
-## More metacharacters: ( and ) revisited
-
-The greediness of * can be turned off with the ?, as in
-
-```markdown
-^s(.*?)s$
-```
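The effect of greediness is easiest to see by extracting the matched text rather than just testing for a match. This sketch uses `regexpr` with `regmatches`; `perl = TRUE` is an assumption made here so that the lazy `*?` quantifier is handled by the PCRE engine.

```r
x <- "sore shoulders, stupid ergonomics"

# Greedy: .* runs as far as it can, so the match ends at the LAST "s"
greedy <- regmatches(x, regexpr("^s(.*)s", x, perl = TRUE))

# Lazy: .*? stops as early as it can, so the match ends at the next "s"
lazy <- regmatches(x, regexpr("^s(.*?)s", x, perl = TRUE))

# With a trailing $, the match must reach the end of the string anyway
lazy_anchored <- regmatches(x, regexpr("^s(.*?)s$", x, perl = TRUE))

greedy         # the whole string
lazy           # "sore s"
lazy_anchored  # the whole string again
```

Note that the `$` in the slide's pattern forces the overall match to span the entire line, so the lazy quantifier only changes the result when the pattern is not anchored at the end.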
-
----
-
-## Summary
-
-- Regular expressions are used in many different languages; not unique to R.
-- Regular expressions are composed of literals and metacharacters that represent sets or classes of characters/words
-- Text processing via regular expressions is a very powerful way to extract data from “unfriendly” sources (not all data comes as a CSV file)
-
-(Thanks to Mark Hansen for some material in this lecture.)
diff --git a/02_RProgramming/Subsetting/index.Rmd b/02_RProgramming/Subsetting/index.Rmd
index 64179968..88816007 100644
--- a/02_RProgramming/Subsetting/index.Rmd
+++ b/02_RProgramming/Subsetting/index.Rmd
@@ -21,7 +21,7 @@ There are a number of operators that can be used to extract subsets of R objects
- `[[` is used to extract elements of a list or a data frame; it can only be used to extract a single element and the class of the returned object will not necessarily be a list or data frame
-- `$` is used to extract elements of a list or data frame by name; semantics are similar to hat of `[[`.
+- `$` is used to extract elements of a list or data frame by name; semantics are similar to that of `[[`.
---
@@ -237,4 +237,4 @@ What if there are multiple things and you want to take the subset with no missin
3 12 149 12.6 74 5 3
4 18 313 11.5 62 5 4
7 23 299 8.6 65 5 7
-```
\ No newline at end of file
+```
diff --git a/02_RProgramming/Subsetting/index.html b/02_RProgramming/Subsetting/index.html
index 52258492..e907844a 100644
--- a/02_RProgramming/Subsetting/index.html
+++ b/02_RProgramming/Subsetting/index.html
@@ -19,6 +19,11 @@
+
+
+
+
+
@@ -53,7 +58,7 @@
Subsetting
[ always returns an object of the same class as the original; can be used to select more than one element (there is one exception)
[[ is used to extract elements of a list or a data frame; it can only be used to extract a single element and the class of the returned object will not necessarily be a list or data frame
-
$ is used to extract elements of a list or data frame by name; semantics are similar to hat of [[.
+
$ is used to extract elements of a list or data frame by name; semantics are similar to that of [[.
diff --git a/02_RProgramming/Subsetting/index.md b/02_RProgramming/Subsetting/index.md
index f236e5ae..88816007 100644
--- a/02_RProgramming/Subsetting/index.md
+++ b/02_RProgramming/Subsetting/index.md
@@ -21,7 +21,7 @@ There are a number of operators that can be used to extract subsets of R objects
- `[[` is used to extract elements of a list or a data frame; it can only be used to extract a single element and the class of the returned object will not necessarily be a list or data frame
-- `$` is used to extract elements of a list or data frame by name; semantics are similar to hat of `[[`.
+- `$` is used to extract elements of a list or data frame by name; semantics are similar to that of `[[`.
---
diff --git a/02_RProgramming/help/GettingHelp.pdf b/02_RProgramming/help/GettingHelp.pdf
deleted file mode 100644
index 0b84067a..00000000
Binary files a/02_RProgramming/help/GettingHelp.pdf and /dev/null differ
diff --git a/02_RProgramming/help/slides/help_slide01.png b/02_RProgramming/help/slides/help_slide01.png
index 3288e691..8e909a13 100644
Binary files a/02_RProgramming/help/slides/help_slide01.png and b/02_RProgramming/help/slides/help_slide01.png differ
diff --git a/02_RProgramming/help/slides/help_slide02.png b/02_RProgramming/help/slides/help_slide02.png
index deb60190..59b313df 100644
Binary files a/02_RProgramming/help/slides/help_slide02.png and b/02_RProgramming/help/slides/help_slide02.png differ
diff --git a/02_RProgramming/help/slides/help_slide03.png b/02_RProgramming/help/slides/help_slide03.png
index 3a011302..2a25751e 100644
Binary files a/02_RProgramming/help/slides/help_slide03.png and b/02_RProgramming/help/slides/help_slide03.png differ
diff --git a/02_RProgramming/help/slides/help_slide04.png b/02_RProgramming/help/slides/help_slide04.png
index 08859e73..11cd28fd 100644
Binary files a/02_RProgramming/help/slides/help_slide04.png and b/02_RProgramming/help/slides/help_slide04.png differ
diff --git a/02_RProgramming/help/slides/help_slide05.png b/02_RProgramming/help/slides/help_slide05.png
index f345144c..da1cc664 100644
Binary files a/02_RProgramming/help/slides/help_slide05.png and b/02_RProgramming/help/slides/help_slide05.png differ
diff --git a/02_RProgramming/help/slides/help_slide06.png b/02_RProgramming/help/slides/help_slide06.png
index a6aa30a6..e738a843 100644
Binary files a/02_RProgramming/help/slides/help_slide06.png and b/02_RProgramming/help/slides/help_slide06.png differ
diff --git a/02_RProgramming/help/slides/help_slide07.png b/02_RProgramming/help/slides/help_slide07.png
index 3a7d8bd5..0774b91b 100644
Binary files a/02_RProgramming/help/slides/help_slide07.png and b/02_RProgramming/help/slides/help_slide07.png differ
diff --git a/02_RProgramming/help/slides/help_slide08.png b/02_RProgramming/help/slides/help_slide08.png
index 38a6a0b9..b609bc07 100644
Binary files a/02_RProgramming/help/slides/help_slide08.png and b/02_RProgramming/help/slides/help_slide08.png differ
diff --git a/02_RProgramming/help/slides/help_slide09.png b/02_RProgramming/help/slides/help_slide09.png
index 03de7e78..051e8c40 100644
Binary files a/02_RProgramming/help/slides/help_slide09.png and b/02_RProgramming/help/slides/help_slide09.png differ
diff --git a/02_RProgramming/help/slides/help_slide10.png b/02_RProgramming/help/slides/help_slide10.png
index bfbfbc7b..403aff32 100644
Binary files a/02_RProgramming/help/slides/help_slide10.png and b/02_RProgramming/help/slides/help_slide10.png differ
diff --git a/02_RProgramming/help/slides/help_slide11.png b/02_RProgramming/help/slides/help_slide11.png
index e79f52c2..4279c36d 100644
Binary files a/02_RProgramming/help/slides/help_slide11.png and b/02_RProgramming/help/slides/help_slide11.png differ
diff --git a/02_RProgramming/help/slides/help_slide12.png b/02_RProgramming/help/slides/help_slide12.png
index e9f98bf0..2515a669 100644
Binary files a/02_RProgramming/help/slides/help_slide12.png and b/02_RProgramming/help/slides/help_slide12.png differ
diff --git a/02_RProgramming/help/slides/help_slide13.png b/02_RProgramming/help/slides/help_slide13.png
index cab85784..e37057d1 100644
Binary files a/02_RProgramming/help/slides/help_slide13.png and b/02_RProgramming/help/slides/help_slide13.png differ
diff --git a/02_RProgramming/help/slides/help_slide14.png b/02_RProgramming/help/slides/help_slide14.png
index 2bea08c7..cee06269 100644
Binary files a/02_RProgramming/help/slides/help_slide14.png and b/02_RProgramming/help/slides/help_slide14.png differ
diff --git a/02_RProgramming/lectures/Subsetting.pdf b/02_RProgramming/lectures/Subsetting.pdf
index 92b158e2..0e576a6d 100644
Binary files a/02_RProgramming/lectures/Subsetting.pdf and b/02_RProgramming/lectures/Subsetting.pdf differ