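1. To simulate a browser sending a GET request, use a Request object. Adding HTTP headers to the Request object lets the request masquerade as one coming from a browser.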
from urllib import request

req = request.Request("http://www.bnaid.com")
# The header name is User-Agent (with a hyphen, not an underscore).
req.add_header('User-Agent', 'Mozilla/6.0 (iPhone; CPU iPhone OS 8_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/8.0 Mobile/10A5376e Safari/8536.25')
with request.urlopen(req) as f:
    print("Status", f.status, f.reason)
    # Print every response header, then the decoded body.
    for k, v in f.getheaders():
        print('%s: %s' % (k, v))
    print("Data", f.read().decode('utf-8'))
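The headers attached to a Request can be checked without touching the network. The following is a minimal sketch under stated assumptions: the URL and the User-Agent value are placeholders chosen for illustration, not values from these notes.

```python
from urllib import request

# Build the request object only; nothing is sent until urlopen() is called.
req = request.Request("http://www.example.com/")  # placeholder URL (assumption)
req.add_header("User-Agent", "Mozilla/5.0 (example)")  # illustrative value (assumption)

# Request stores header names via str.capitalize(), so the key reads "User-agent".
print(req.get_method())              # "GET", because no data is attached
print(req.get_header("User-agent"))  # the value set above
print(req.header_items())            # all headers as (name, value) pairs
```

Note that the header name is spelled with a hyphen; a name with an underscore would be sent as a different, unrelated header.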
2. To send a POST request, simply pass the parameters as bytes through the data argument.
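The bytes requirement can be checked offline with a small sketch. The form fields and the URL here are hypothetical and serve only to show the str-to-bytes step and its effect on the request method.

```python
from urllib import parse, request

# Hypothetical form fields, used only for illustration.
form = [("username", "alice"), ("savestate", "1")]

# urlencode() returns a str; the request body must be bytes, so encode it.
body = parse.urlencode(form).encode("utf-8")

# Placeholder URL (assumption). Attaching data switches the method to POST.
req = request.Request("http://www.example.com/login", data=body)
print(body)
print(req.get_method())
```

Passing the same bytes as `data=` to `request.urlopen()` has the same effect as attaching them to the Request.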
from urllib import request, parse

print('Login to weibo.cn...')
email = input('Email: ')
passwd = input('Password: ')
# Encode the form fields as a URL-encoded string.
login_data = parse.urlencode([
    ('username', email),
    ('password', passwd),
    ('entry', 'mweibo'),
    ('client_id', ''),
    ('savestate', '1'),
    ('ec', ''),
    ('pagerefer', 'https://passport.weibo.cn/signin/welcome?entry=mweibo&r=http%3A%2F%2Fm.weibo.cn%2F')
])

req = request.Request('https://passport.weibo.cn/sso/login')
req.add_header('Origin', 'https://passport.weibo.cn')
req.add_header('User-Agent', 'Mozilla/6.0 (iPhone; CPU iPhone OS 8_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/8.0 Mobile/10A5376e Safari/8536.25')
req.add_header('Referer', 'https://passport.weibo.cn/signin/login?entry=mweibo&res=wel&wm=3349&r=http%3A%2F%2Fm.weibo.cn%2F')

# Passing data (as bytes) turns the request into a POST.
with request.urlopen(req, data=login_data.encode('utf-8')) as f:
    print('Status:', f.status, f.reason)
    for k, v in f.getheaders():
        print('%s: %s' % (k, v))
    print('Data:', f.read().decode('utf-8'))

3. For more fine-grained control, such as accessing a site through a proxy, use ProxyHandler.
from urllib import request, parse

login_data = parse.urlencode([
    ('entry', 'mweibo'),
    ('client_id', ''),
    ('savestate', '1'),
    ('ec', ''),
    ('pagerefer', 'http://www.douban.com/')
])

req = request.Request('http://www.douban.com/')
req.add_header('User-Agent', 'Mozilla/6.0 (iPhone; CPU iPhone OS 8_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/8.0 Mobile/10A5376e Safari/8536.25')

# The address below is the one from these notes; replace it with the address of a real proxy server.
proxy_handler = request.ProxyHandler({'http': 'http://www.douban.com/'})
proxy_auth_handler = request.ProxyBasicAuthHandler()
# 'realm', 'host', 'username' and 'password' are placeholders.
proxy_auth_handler.add_password('realm', 'host', 'username', 'password')

# Requests must go through the opener; plain request.urlopen() would ignore these handlers.
opener = request.build_opener(proxy_handler, proxy_auth_handler)
with opener.open(req, data=login_data.encode('utf-8')) as f:
    print('Status:', f.status, f.reason)
    for k, v in f.getheaders():
        print('%s: %s' % (k, v))
    print('Data:', f.read().decode('utf-8'))
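How the handlers are chained into an opener can be seen without sending anything. This is a minimal sketch, assuming a hypothetical proxy at proxy.example.com:3128 and placeholder credentials; none of these values come from the notes.

```python
from urllib import request

# Placeholder proxy address (assumption, for illustration only).
PROXY_URL = "http://proxy.example.com:3128/"

proxy_handler = request.ProxyHandler({"http": PROXY_URL})
proxy_auth_handler = request.ProxyBasicAuthHandler()
# Arguments are (realm, uri, user, passwd); all four are placeholders here.
proxy_auth_handler.add_password("realm", "proxy.example.com:3128", "username", "password")

# build_opener() chains the handlers; nothing is sent until opener.open() is called.
opener = request.build_opener(proxy_handler, proxy_auth_handler)

# Only requests made through this opener use the proxy, for example:
# with opener.open("http://www.example.com/") as f:
#     print(f.status, f.reason)
print(proxy_handler.proxies)
print(type(opener).__name__)
```

If every later call to `request.urlopen()` should use the proxy as well, `request.install_opener(opener)` registers the opener globally.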
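4. XML is more complex than JSON and is used on the web less than it used to be. XML can be processed with either DOM or SAX. DOM reads the entire XML document into memory, so it uses more memory and parses more slowly, but it lets you traverse any node of the tree freely. SAX works in streaming mode: it parses while reading and uses little memory. In general, prefer SAX. When parsing XML in Python, you usually care about only three events: start_element, end_element, and char_data.

5. GitHub command notes

git config -l    Show the detailed current Git configuration
View configuration at different levels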
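View user information
Set your own user information
Create a Git repository
Clone a remote repository to a repository on your own computer
Check whether the status of files has changed
Add files to the staging area
Check whether the changes have been committed to the repository
After removing a file from the staging area, check the status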
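Going back to item 4, the three events can be sketched with the standard library's expat parser (xml.parsers.expat). This is an assumption about what the notes had in mind, based on the callback names; the handler class and the sample XML are hypothetical and only illustrate the streaming style.

```python
from xml.parsers.expat import ParserCreate


class DefaultSaxHandler:
    """Collects the three events as tuples so they can be inspected afterwards."""

    def __init__(self):
        self.events = []

    def start_element(self, name, attrs):
        # Called when an opening tag is read; attrs is a dict of attributes.
        self.events.append(("start", name, dict(attrs)))

    def end_element(self, name):
        # Called when a closing tag is read.
        self.events.append(("end", name))

    def char_data(self, text):
        # Called for text between tags; whitespace-only chunks are skipped.
        if text.strip():
            self.events.append(("text", text))


# Sample document (assumption, for illustration only).
xml = '<ol><li><a href="/python">Python</a></li><li><a href="/ruby">Ruby</a></li></ol>'

handler = DefaultSaxHandler()
parser = ParserCreate()
parser.StartElementHandler = handler.start_element
parser.EndElementHandler = handler.end_element
parser.CharacterDataHandler = handler.char_data
parser.Parse(xml, True)  # True marks this as the final chunk of input

for event in handler.events:
    print(event)
```

Because the parser only calls back as it reads, the handler has to keep whatever state it needs itself. For long text, char_data may be called more than once per element, so real code usually concatenates the pieces.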