A small article for those who want to experiment with regular expressions in PHP. Regular expressions, also referred to as regex or regexp, provide a concise and flexible means for matching strings of text, such as particular characters, words, or patterns of characters. Many text editors, utilities, and programming languages use regular expressions to search and manipulate text based on patterns.
OK. Let’s now try a small example. Let’s try to find the URL defined in the href attribute and the link text of every <a> tag present in an HTML string.
This is the HTML we have:
<html>
<body>
<a href="http://www.google.com">Google</a>
<a href="http://www.yahoo.com">Yahoo</a>
</body>
</html>
We will now find the href value and the link text of each link in the HTML above, so we expect output like this:
http://www.google.com – Google
http://www.yahoo.com – Yahoo
Here is the regular expression for this:
preg_match_all('/<a\s[^>]*href="([^"]*)"[^>]*>(.*?)<\/a>/is', $yourhtmlstring, $matches, PREG_SET_ORDER);
- PREG_SET_ORDER orders the results so that $matches[0] is an array holding the first set of matches (the first link found), $matches[1] is an array holding the second set of matches, and so on.
All the matches found are returned in the $matches array. Let’s look at the contents of the $matches array.
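To make the ordering concrete, here is a minimal sketch that runs a pattern along the lines of the one above under both PREG_SET_ORDER and the default PREG_PATTERN_ORDER. The variable names ($html, $pattern, $bySet, $byPattern) and the exact pattern used here are assumptions made for illustration.

```php
<?php
// Sample input modelled on the HTML above; all variable names are illustrative.
$html = '<a href="http://www.google.com">Google</a>' . "\n"
      . '<a href="http://www.yahoo.com">Yahoo</a>';

// Assumed pattern: group 1 captures the href value, group 2 the link text.
$pattern = '/<a\s[^>]*href="([^"]*)"[^>]*>(.*?)<\/a>/is';

// PREG_SET_ORDER: one entry per link, each holding [full match, href, text].
preg_match_all($pattern, $html, $bySet, PREG_SET_ORDER);
echo $bySet[0][2], "\n";      // link text of the first link

// PREG_PATTERN_ORDER (the default): one entry per capture group.
preg_match_all($pattern, $html, $byPattern, PREG_PATTERN_ORDER);
echo $byPattern[2][0], "\n";  // the same link text, with the two indexes swapped
```

With PREG_SET_ORDER, a single foreach over the result visits one link at a time, which is why it is convenient for this task.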
Array
(
    [0] => Array
        (
            [0] => <a href="http://www.google.com">Google</a>
            [1] => http://www.google.com
            [2] => Google
        )

    [1] => Array
        (
            [0] => <a href="http://www.yahoo.com">Yahoo</a>
            [1] => http://www.yahoo.com
            [2] => Yahoo
        )

)
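If you want to try this without a web form, the following sketch can be run from the command line. It loops over the sets in $matches and builds the "URL - link text" lines we were aiming for. The function name extractLinks() is hypothetical, and the pattern is an assumed variant along the lines of the one above.

```php
<?php
// Hypothetical helper: returns one "url - text" string per <a> tag found.
function extractLinks(string $html): array
{
    // Assumed pattern: group 1 is the href value, group 2 the link text.
    $pattern = '/<a\s[^>]*href="([^"]*)"[^>]*>(.*?)<\/a>/is';
    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

    $links = [];
    foreach ($matches as $match) {
        // $match[0] is the whole tag, $match[1] the href, $match[2] the text.
        $links[] = $match[1] . ' - ' . $match[2];
    }
    return $links;
}

// The sample HTML from this article.
$html = <<<HTML
<html>
<body>
<a href="http://www.google.com">Google</a>
<a href="http://www.yahoo.com">Yahoo</a>
</body>
</html>
HTML;

echo implode("\n", extractLinks($html)), "\n";
// Prints:
// http://www.google.com - Google
// http://www.yahoo.com - Yahoo
```

Keep in mind that a regular expression like this is only suitable for simple, well-formed markup; attributes in single quotes or unquoted href values would not be matched by this sketch.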
A simple script for you to try:
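<?php
if (isset($_POST['data'])) {
    // Collect every <a> tag: $match[1] is the href value, $match[2] the link text.
    preg_match_all('/<a\s[^>]*href="([^"]*)"[^>]*>(.*?)<\/a>/is', $_POST['data'], $matches, PREG_SET_ORDER);
    foreach ($matches as $match) {
        // Escape both values before echoing them back into the page.
        echo htmlentities($match[2]).' : '.htmlentities($match[1])."<br/>";
    }
}
?>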
<br/>
<br/>
<form action="" enctype="multipart/form-data" method="post">
<textarea name="data" rows="10" cols="100"></textarea><br>
<input type="submit" name="submit"/>
</form>
When run, this script shows a text area where you can paste HTML code with <a> tags in it. Submit the form and you will see the extracted links.
More articles on regular expressions coming soon!
Enjoy!



it is not working for me
Hi Suraj,
There was an error in the code as some characters appeared as html entities.
I have updated the code above.
Try and let me know.
Regards,
Anees
Thanks for your kind reply.
This is what I wanted. Thank you.
Great to know Suraj. Have fun!