February « 2011 « Security related discussions, articles, and tutorials

Quick Website Link Puller Script

I know it has been a while, but what can I say? I have been busy =)

Anyway, I wrote a quick link puller script that fetches a website's source code and prints every absolute link it finds in an href attribute to the terminal. I found it useful, so I thought I would share it with you all. I know there are other programs out there that probably do the same thing, but hey, I like the challenge of thinking through the scripting process and making my scripts work. Hope you all enjoy.

ex: ./link_puller.py http://pentest-labs.org

Script:

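#!/usr/bin/python

"""URL Puller - pulls the source and parses links from a specified website"""

import sys

try:
    from urllib2 import urlopen          # Python 2
except ImportError:
    from urllib.request import urlopen   # Python 3

usage = '''
link_puller.py coded by: bostonlink @ pentest-labs.org
example: ./link_puller.py http://pentest-labs.org
'''

if len(sys.argv) != 2:
    print(usage)
    sys.exit(1)  # non-zero exit status: the script was called incorrectly

# Fetch the page source
html_read = urlopen(sys.argv[1]).read()
if not isinstance(html_read, str):
    # Python 3 returns bytes; decode so the string handling below works
    html_read = html_read.decode('utf-8', 'replace')

# Split on whitespace and look only at tokens that carry an href attribute
for token in html_read.split():
    if 'href=' not in token:
        continue
    # Keep what follows 'href='. (lstrip('href=') strips a set of characters,
    # not a prefix, so it would also eat the 'h' of an unquoted http://.)
    value = token.split('href=', 1)[1]
    # Cut off the rest of the tag, then the surrounding quotes
    link = value.split('>')[0].strip("'\"")
    # Print absolute links only; https:// is accepted alongside http://
    if link.startswith(('http://', 'https://')):
        print(link)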
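
Splitting the source on whitespace is quick, but it misses links whose tag is formatted unusually and it skips relative links altogether. If you want something a bit sturdier, here is a minimal sketch built on Python 3's standard html.parser and urllib.parse modules. It is not part of the script above: the names LinkParser and pull_links are made up for illustration, and it assumes you only care about href attributes on <a> tags.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag (hypothetical helper)."""

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with quotes already removed
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # urljoin resolves relative links against the page URL
                self.links.append(urljoin(self.base_url, value))


def pull_links(html, base_url=""):
    """Return all anchor links found in an HTML string, in document order."""
    parser = LinkParser(base_url)
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    sample = (
        '<p><a href="http://pentest-labs.org">home</a> '
        "<a href='/about'>about</a></p>"
    )
    for link in pull_links(sample, "http://pentest-labs.org/"):
        print(link)
```

To run it against a live site, pass the decoded page source and the page URL to pull_links. Letting the parser handle quoting and tag boundaries is the main reason to prefer this over string splitting, at the cost of a few more lines.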