[Noisebridge-discuss] city of oakland internal emails dump on DocumentCloud

Flatline flatline at hackbloc.org
Mon Mar 5 17:58:51 PST 2012


Or here it all is in one handy text file:
http://s3.documentcloud.org/documents/320449/oakland-city-official-emails-10-11-2011-to-11-13.txt
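For anyone who would rather script the single-file download than click the link, here is a minimal Python 3 sketch. The helper names `local_name` and `fetch` are made up for illustration, and it assumes the URL above is still being served; treat it as a starting point rather than a tested tool.

```python
import os
import posixpath
from urllib.parse import urlsplit
from urllib.request import urlopen

# URL of the combined text file, copied from the message above.
TEXT_URL = ("http://s3.documentcloud.org/documents/320449/"
            "oakland-city-official-emails-10-11-2011-to-11-13.txt")


def local_name(url):
    """Return the last path component of the URL, used as the local filename."""
    return posixpath.basename(urlsplit(url).path)


def fetch(url, dest_dir="."):
    """Download url into dest_dir and return the path that was written.

    Hypothetical helper: it reads the whole response into memory, which is
    fine for one text file but not for very large downloads.
    """
    path = os.path.join(dest_dir, local_name(url))
    with urlopen(url) as response, open(path, "wb") as out:
        out.write(response.read())
    return path
```

Calling `fetch(TEXT_URL)` should write the file into the current directory under the name `local_name(TEXT_URL)` returns.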

Flatline
http://www.hackbloc.org

On 03/05/2012 02:35 PM, Nicholas Granado wrote:
> you can run the following code to download them all ....
> 
> #!/usr/bin/python
> import os
> import urllib2
> 
> def download_image_url(url):
>     # fetch one page image and save it under ./data/
>     request = urllib2.Request(url)
>     opener = urllib2.build_opener(urllib2.HTTPRedirectHandler(),
>                                   urllib2.HTTPHandler(debuglevel=0))
>     handle = opener.open(request)
>     payload = handle.read()
>     # the last path component of the URL is the file name
>     filename = url.split('/')[-1]
>     image_filename = "./data/%s" % (filename)
>     # binary mode, since the payload is a GIF
>     fh = open(image_filename, 'wb')
>     fh.write(payload)
>     fh.close()
>     print "%s" % (filename)
> 
> def main():
>     # make sure the output directory exists
>     if not os.path.isdir("./data"):
>         os.makedirs("./data")
>     # pages are numbered 1 through 2183
>     for i in range(1, 2184):
>         url = "http://s3.documentcloud.org/documents/320449/pages/oakland-city-official-emails-10-11-2011-to-11-13-p%d-normal.gif" % (i)
>         download_image_url(url)
> 
> if __name__ == "__main__":
>     main()
> 
> nick
> 
> 
> 
> On Mon, Mar 5, 2012 at 2:21 PM, Nicholas Granado <ngranado at gmail.com> wrote:
> 
>     they are gif files, one per page. the url format is (replace # with the page number) ....
> 
>     http://s3.documentcloud.org/documents/320449/pages/oakland-city-official-emails-10-11-2011-to-11-13-p#-normal.gif
> 
>     so for example if i wanted page 54
> 
>     http://s3.documentcloud.org/documents/320449/pages/oakland-city-official-emails-10-11-2011-to-11-13-p54-normal.gif
> 
>     cheers,
>     nick
> 
> 
> 
> 
>     On Mon, Mar 5, 2012 at 2:18 PM, Jake <jake at spaz.org> wrote:
> 
>         does anyone know how to download the entire 2183 pages?
>         I couldn't find a download button :)
> 
>         http://www.mercurynews.com/documents/ci_20040081
>         _______________________________________________
>         Noisebridge-discuss mailing list
>         Noisebridge-discuss at lists.noisebridge.net
>         <mailto:Noisebridge-discuss at lists.noisebridge.net>
>         https://www.noisebridge.net/mailman/listinfo/noisebridge-discuss
> 
> 
> 
> 
> 
> _______________________________________________
> Noisebridge-discuss mailing list
> Noisebridge-discuss at lists.noisebridge.net
> https://www.noisebridge.net/mailman/listinfo/noisebridge-discuss
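The per-page URL pattern described in the thread can also be written as a small helper that only builds the URLs, leaving the actual downloading to whatever tool is at hand. This is a Python 3 sketch; `page_url`, `all_page_urls`, and `LAST_PAGE` are illustrative names, and the page count of 2183 is taken from the thread rather than checked against the server.

```python
# URL template from the thread, with %d standing in for the page number.
PAGE_URL = ("http://s3.documentcloud.org/documents/320449/pages/"
            "oakland-city-official-emails-10-11-2011-to-11-13-p%d-normal.gif")

# Page count as stated in the thread (an assumption, not verified here).
LAST_PAGE = 2183


def page_url(page):
    """Return the image URL for one page, numbered from 1 to LAST_PAGE."""
    if not 1 <= page <= LAST_PAGE:
        raise ValueError("page must be between 1 and %d" % LAST_PAGE)
    return PAGE_URL % page


def all_page_urls():
    """Return the URLs for every page, in page order."""
    return [page_url(n) for n in range(1, LAST_PAGE + 1)]
```

For example, `page_url(54)` gives the page 54 link quoted in the thread, and the list from `all_page_urls()` could be written to a file and handed to a bulk downloader.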
