
Analyzing the JS dynamic key in the 12306 login, with a Python implementation

    jack139 · 2015-07-21 22:28:21 +08:00 · 4918 views
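    I have been tinkering with 12306 for a while. I won't go into captcha solving, since plenty of solutions for that already exist. Instead I'll cover something more interesting: the JS dynamic key. That is my own name for it; I don't know what they call it internally. Roughly, before logging in you first have to obtain a dynamically generated encrypted string and submit it together with the login request and the captcha result, which adds a layer of protection that the end user never sees. The goal is obvious: to put obstacles in the way of those of us who want to write a small script that books tickets automatically.

    My guess is that 12306 wanted a key mechanism along the lines of client certificates, but because everything runs in a plain web page, JavaScript is all they have to work with. They did put real effort into it and came up with a dynamic js. Unfortunately, "dynamic" only means that the js file name and the key itself change; the code inside the js never changes. You get the idea.

    Now for the code. I rewrote the js encryption part in Python. The production system has to handle high concurrency, so HTTP connections go through urllib3, because that module supports re-entrant use from multiple threads. The two files here, httphelper3.py and dynamicJS.py, cover only fetching the dynamic js and deriving the dynamic key from it. The rest (login, captcha solving, ticketing)? Analyze the HTTP exchange yourself; that part is just grunt work. (Fiddler2 is an excellent tool for this.)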
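Since only the script name and the key rotate while the surrounding code stays fixed, both can be located with fixed patterns. The following is a minimal sketch in Python 3, not the code from this post: the HTML and JS samples are hypothetical stand-ins modelled on the search strings `src="/otn/dynamicJs` and `function gc(){` that the listings look for, and the real pages may well look different.

```python
import re

# Hypothetical samples; only the script name and the key are taken from
# the test session shown later in this post.
LOGIN_HTML = '<script src="/otn/dynamicJs/lgotysx" type="text/javascript"></script>'
DYNAMIC_JS = "function gc(){var key='MTE5NDIx';return key;}"

# Path of the dynamic script referenced by the login page.
SCRIPT_RE = re.compile(r'src="(/otn/dynamicJs/[^"]+)"')
# First quoted string assigned inside gc(), assumed to be the key.
KEY_RE = re.compile(r"function\s+gc\(\)\s*\{\s*var\s+\w+\s*=\s*'([^']*)'")


def find_script_path(html):
    """Return the dynamic script path, or None if the page has none."""
    match = SCRIPT_RE.search(html)
    return match.group(1) if match else None


def find_key(js):
    """Return the key literal found in gc(), or None if it is missing."""
    match = KEY_RE.search(js)
    return match.group(1) if match else None


print(find_script_path(LOGIN_HTML))
print(find_key(DYNAMIC_JS))
```

Returning None on a miss keeps the caller in charge of error handling, much as the E_DATA return code does in the listings.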

    httphelper3.py
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    #
    #
    import socket, urllib, urllib3
    import dynamicJS

    # Return codes
    E_OK = 0
    E_QUERY = -1
    E_DATA = -2

    #
    # ----------------- define about connection ---------------------------------------------
    #

    CONN_TIMEOUT = 180

    socket.setdefaulttimeout(CONN_TIMEOUT)

    urllib3.disable_warnings()

    # cookie pool
    cookie_pool = {}

    # connection pool
    conn_pool = {}

    user_agent = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.124 Safari/537.36'


    #
    # ----------------- HTTP GET/POST & Cookie ---------------------------------------------
    #

    def new_cookie(pool_id, name, value):  # add a new cookie
        global cookie_pool
        if pool_id in cookie_pool:
            cookie_pool[pool_id][name] = value
        else:
            cookie_pool[pool_id] = { name : value }

    def get_cookie(pool_id):
        if pool_id in cookie_pool:
            return cookie_pool[pool_id]
        else:
            return {}

    def set_cookie(pool_id, c):
        global cookie_pool
        if c is None:
            cookie_pool[pool_id] = {}
        else:
            cookie_pool[pool_id] = c

    def clear_cookie(pool_id):
        set_cookie(pool_id, None)

    def remove_session_cookie(pool_id):
        global cookie_pool
        if pool_id in cookie_pool:
            cookie_pool[pool_id].pop('BIGipServerotn', None)
            cookie_pool[pool_id].pop('JSESSIONID', None)

    def new_pool(pool_id):
        global conn_pool
        pool = urllib3.PoolManager(num_pools=50, timeout=CONN_TIMEOUT, retries=False)
        conn_pool[pool_id] = pool
        return pool

    def get_pool(pool_id):
        if pool_id in conn_pool:
            return conn_pool[pool_id]
        else:
            print 'get_pool(): %s not found!' % pool_id
            return None

    def set_todo(pool_id, new_cookie=None):
        # create the connection pool if it does not exist yet
        if pool_id not in conn_pool:
            new_pool(pool_id)
        # set the cookies
        set_cookie(pool_id, new_cookie)
        # a plain PoolManager has no proxy; only a ProxyManager carries a host
        proxy = getattr(conn_pool[pool_id], 'proxy', None)
        print 'set_todo(): %s - %s' % (pool_id, proxy.host if proxy else 'direct')
        return get_pool(pool_id)

    def close_pool(pool_id):
        global conn_pool
        # drop the connection pool
        if pool_id in conn_pool:
            conn_pool.pop(pool_id, None)
        # clear the cookies
        clear_cookie(pool_id)

    def http_header(pool_id, host=None, origin=None, refer=None, more=None, isPOST=True):
        header = {}
        header['Connection'] = 'keep-alive'
        header['Accept'] = '*/*'
        header['Accept-Language'] = 'zh-CN,zh;q=0.8'
        header['Accept-Encoding'] = 'gzip,deflate'
        header['User-Agent'] = user_agent
        if isPOST:
            header['Content-Type'] = 'application/x-www-form-urlencoded; charset=UTF-8'
        if more is not None:
            for h in more:
                header[h[0]] = h[1]
        if host is not None:
            header['Host'] = host
        if origin is not None:
            header['Origin'] = origin
        if refer is not None:
            header['Referer'] = refer
        if len(cookie_pool[pool_id]) > 0:
            header['Cookie'] = '; '.join('%s=%s' % (k,v) for (k,v) in cookie_pool[pool_id].items())
        return header

    def http_do_request(pool_id, method, url, header, body=None):
        global cookie_pool
        #print body
        try:
            pool = get_pool(pool_id)
            #print pool, method, url, header
            r = pool.urlopen(method, url, headers=header, body=body)

            # handle set-cookie (membership test on the header dict itself
            # is case-insensitive, unlike a test against .keys())
            if 'set-cookie' in r.headers:
                l = r.headers['set-cookie'].split(',')
                for i in l:
                    t = i.split(';')[0].split('=')
                    if len(t) == 2:
                        # cookie values can contain commas!!! such fragments
                        # do not split into name=value and are skipped
                        cookie_pool[pool_id][t[0].strip()] = t[1].strip()

            if r.status < 500:  #r.status==200 or r.status==405:
                return r.data
            else:
                print 'HTTP ERROR: ', r.status, url
                return None

        except Exception as e:
            print '%s: %s (%s)' % (type(e), e, url)
            return None

    def http_get(pool_id, url, host=None, origin=None, refer=None, more=None):
        # GET
        print url
        header = http_header(pool_id, host, origin, refer, more, isPOST=False)
        return http_do_request(pool_id, 'GET', url, header)

    # para is sent as-is when json=True; when json=False it must be a dict
    # of parameters, which is joined into a form body
    def http_post(pool_id, url, para, host=None, origin=None, refer=None, more=None, json=True):
        # POST
        if json:
            data = para
        else:
            data = '&'.join(['%s=%s' % (str(k),str(v)) if v!=None else str(k) for (k,v) in para.items()])
        print url
        print para
        header = http_header(pool_id, host, origin, refer, more)
        header['X-Requested-With'] = 'XMLHttpRequest'
        return http_do_request(pool_id, 'POST', url, header, data)

    #
    # --------------- API to 12306 --------------------------------------------------
    #


    # Return the value of a variable assignment such as "var abc={'a':1}"
    find_start = {}
    def get_content(pool_id, whole_str, var_name, end_char=';', split_char='=', add_head='', add_tail='', need_replace=False, no_eval=False):
        global find_start

        b = whole_str.find(var_name)
        if b == -1:
            print 'get_content() fail: var_name = %s' % var_name
            return None
        c = whole_str[b:].find(end_char)
        if c == -1:
            print 'get_content() fail: var_name = %s, end_char = %s' % (var_name, end_char)
            return None
        d = whole_str[b:b+c].split(split_char)
        find_start[pool_id] = b+c
        if len(d) != 2:
            print whole_str[b:b+c]
            print 'get_content() fail: var_name = %s, split_char = %s' % (var_name, split_char)
            return None

        e = add_head+d[1]+add_tail
        if no_eval:  # plain strings do not need eval; just strip the quotes
            return e[1:-1]
        try:
            # note: eval runs whatever the server sent, so only use this
            # branch on content you trust
            if need_replace:
                return eval(e.replace('null','None').replace('true','True').replace('false','False'))
            else:
                return eval(e)
        except SyntaxError:
            print 'get_content() SyntaxError: var_name = %s, d = %s' % (var_name, str(d))
            return None


    # Fetch the dynamic encryption parameters
    # 0 - login
    # 1 - leftTicket
    #

    page_url = [
        {
            'url' : 'https://kyfw.12306.cn/otn/login/init',
            'host' : 'kyfw.12306.cn',
            'refer' : 'https://kyfw.12306.cn/otn/'
        },
        {
            'url' : 'https://kyfw.12306.cn/otn/leftTicket/init',
            'host' : 'kyfw.12306.cn',
            'refer' : 'https://kyfw.12306.cn/otn/index/init'
        },
    ]

    def get_dynamic_key_from_js(pool_id, js_url, submit_token=None):
        # GET the dynamic js
        print 'get_dynamic_key_from_js(%s)' % js_url

        data = http_get(pool_id, 'https://kyfw.12306.cn'+js_url, host='kyfw.12306.cn', refer='https://kyfw.12306.cn/otn/login/init')
        if data is None:
            return (E_QUERY, 'query no return from js')

        ready_start = data.find('ready(function()')
        if ready_start == -1:
            return (E_DATA, 'ready function not found')

        # the js may reference a second dynamicJs url that has to be POSTed to;
        # when it is absent, get_content() logs a failure and we carry on
        js_url0 = get_content(pool_id, data[ready_start:], 'url :\'/otn/dynamicJs/', split_char=':', end_char=',', no_eval=True)
        if js_url0 is not None:
            print js_url0
            if submit_token is None:
                http_post(pool_id, 'https://kyfw.12306.cn'+js_url0, None,
                    host='kyfw.12306.cn', refer='https://kyfw.12306.cn/otn/login/init',
                    more=[('Content-Length', '0')])
            else:
                http_post(pool_id, 'https://kyfw.12306.cn'+js_url0, '_json_att=&REPEAT_SUBMIT_TOKEN=%s' % submit_token,
                    host='kyfw.12306.cn', refer='https://kyfw.12306.cn/otn/confirmPassenger/initDc')

        key = get_content(pool_id, data, 'function gc(){', end_char=';', no_eval=True)
        #print key
        if key is None:
            return (E_DATA, 'key not found')
        else:
            return ( E_OK, ( key, urllib.quote_plus(dynamicJS.encrypt1('1111', key)) ) )

    def get_dynamic_key(pool_id, page):
        # GET
        print 'get_dynamic_key(%d)' % page

        # find the url of the dynamic js
        data = http_get(pool_id, page_url[page]['url'], host=page_url[page]['host'], refer=page_url[page]['refer'])
        if data is None:
            return (E_QUERY, 'query no return')

        js_url = get_content(pool_id, data, 'src="/otn/dynamicJs', end_char=' type', no_eval=True)
        #print js_url
        if js_url is None:
            return (E_DATA, 'dynamic JS not found')
        else:
            return get_dynamic_key_from_js(pool_id, js_url)
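The comment in http_do_request points out that splitting Set-Cookie on commas is fragile, because an Expires date contains a comma of its own. A possible alternative, sketched here in Python 3 and not part of the original code, is to parse each Set-Cookie header value separately with the standard library's http.cookies module (named Cookie in Python 2). The header values in the example are made up for illustration; how to obtain them as a list depends on the urllib3 version in use.

```python
from http.cookies import SimpleCookie


def merge_set_cookie(jar, header_values):
    """Merge raw Set-Cookie header values (one string each) into dict jar."""
    for raw in header_values:
        parsed = SimpleCookie()
        parsed.load(raw)
        # Attributes such as Path and Expires are stored on the morsel,
        # so only real cookie names show up as keys here.
        for name, morsel in parsed.items():
            jar[name] = morsel.value
    return jar


def cookie_header(jar):
    """Build the value of a Cookie request header from the jar."""
    return '; '.join('%s=%s' % (k, v) for k, v in sorted(jar.items()))


# Made-up header values; the third one carries a comma inside Expires.
jar = merge_set_cookie({}, [
    'JSESSIONID=ABC123; Path=/otn',
    'BIGipServerotn=1234.5678.0000; path=/',
    'demo=1; Expires=Wed, 21 Oct 2015 07:28:00 GMT; Path=/',
])
print(cookie_header(jar))
```

Sorting the names only makes the output deterministic; servers do not care about the order.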

    dynamicJS.py
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    #

    import math, urllib

    def bin216(s1):
        # hex-encode every byte of the input
        s = str(s1)
        o = ''
        for i in xrange(len(s)):
            b = ord(s[i])
            n = '%02x' % b
            o += n
        return o


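    # Kept as posted. The routine below closely resembles XXTEA, whose
    # reference constant is 0x9E3779B9, so do not "correct" this value.
    delta = 0x9E3779B8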

    def longArrayToString(data, includeLength):
        length = len(data)
        n = (length - 1) << 2

        if includeLength:
            m = data[length - 1]
            if (m < n - 3) or (m > n):
                return None
            n = m

        for i in xrange(length):
            # unsigned right shift, js: -1 >>> 1, python: (-1 & 0xffffffff) >> 1
            data[i] = chr(data[i] & 0xff) \
                + chr((data[i] & 0xffffffff) >> 8 & 0xff) \
                + chr((data[i] & 0xffffffff) >> 16 & 0xff) \
                + chr((data[i] & 0xffffffff) >> 24 & 0xff)

        if includeLength:
            return ''.join(x for x in data)[0:n]
        else:
            return ''.join(x for x in data)

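    def stringToLongArray(string1, includeLength):
        length = len(string1)
        # pad with NULs so the 4-byte reads never run past the end of the
        # string (the unpadded version raises IndexError on such input)
        padded = string1 + '\0' * (-length % 4)
        result = []
        for i in xrange(0, len(padded), 4):
            result.append(ord(padded[i]) \
                | ord(padded[i + 1]) << 8 \
                | ord(padded[i + 2]) << 16 \
                | ord(padded[i + 3]) << 24)

        if includeLength:
            result.append(length)

        return result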

    def encrypt(string1, key):
        if string1 == '':
            return ''

        v = stringToLongArray(string1, True)
        k = stringToLongArray(key, False)

        if len(k) < 4:
            k += [0]*(4-len(k))  # pad the key with zeros

        n = len(v) - 1
        z = v[n]
        y = v[0]
        q = int(math.floor(6 + 52 / (n + 1)))
        sum1 = 0

        while 0 < q:
            q -= 1
            sum1 = sum1 + delta & 0xffffffff
            e = (sum1 & 0xffffffff) >> 2 & 3

            for p in xrange(n):
                y = v[p + 1]
                mx = ((z & 0xffffffff) >> 5 ^ y << 2) \
                    + ((y & 0xffffffff) >> 3 ^ z << 4) ^ (sum1 ^ y) \
                    + (k[p & 3 ^ e] ^ z)
                z = v[p] = v[p] + mx & 0xffffffff
            # last word of the block: p == n, and y wraps around to v[0]
            p += 1
            y = v[0]
            mx = ((z & 0xffffffff) >> 5 ^ y << 2) \
                + ((y & 0xffffffff) >> 3 ^ z << 4) ^ (sum1 ^ y) \
                + (k[p & 3 ^ e] ^ z)
            z = v[n] = v[n] + mx & 0xffffffff

        return longArrayToString(v, False)

    keyStr = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/="

    def encode32(input0):
        input1 = urllib.quote_plus(input0)
        # pad with NULs to a multiple of 3 bytes
        input1 += '\0'*((3-len(input1)%3)%3)
        output = ''
        i = 0
        # loop on the length so that empty input yields '' instead of IndexError
        while i < len(input1):
            chr1 = ord(input1[i])
            chr2 = ord(input1[i+1])
            chr3 = ord(input1[i+2])
            i += 3
            enc1 = chr1 >> 2
            enc2 = ((chr1 & 3) << 4) | (chr2 >> 4)
            enc3 = ((chr2 & 15) << 2) | (chr3 >> 6)
            enc4 = chr3 & 63
            # index 64 is '=', the padding character
            if chr2 == 0:
                enc3 = enc4 = 64
            elif chr3 == 0:
                enc4 = 64
            output = output + keyStr[enc1] + keyStr[enc2] + keyStr[enc3] + keyStr[enc4]

        return output

    def encrypt1(string1, key):
        return encode32(bin216(encrypt(string1, key)))
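The block cipher in encrypt() is easier to check when it works on lists of 32-bit words and comes with an inverse. Below is a minimal sketch of that idea in Python 3; it is a re-port written for illustration, not the author's code. The names to_words, encrypt_words and decrypt_words are hypothetical, and decrypt_words is an addition that the post does not contain. The mixing expression and the constant are copied from the listing above.

```python
import struct

MASK = 0xFFFFFFFF
DELTA = 0x9E3779B8  # constant as posted in dynamicJS.py


def to_words(data, include_length):
    """Pack bytes into little-endian 32-bit words, like stringToLongArray."""
    padded = data + b'\0' * (-len(data) % 4)
    words = list(struct.unpack('<%dI' % (len(padded) // 4), padded))
    if include_length:
        words.append(len(data))
    return words


def _mx(z, y, total, key, p, e):
    # Same mixing expression as in encrypt(), reduced to 32 bits.
    return ((((z >> 5) ^ (y << 2)) + ((y >> 3) ^ (z << 4)))
            ^ ((total ^ y) + (key[(p & 3) ^ e] ^ z))) & MASK


def encrypt_words(v, key):
    """Encrypt a list of at least two words with a four-word key."""
    v = list(v)
    n = len(v) - 1
    rounds = 6 + 52 // (n + 1)
    total = 0
    z = v[n]
    for _ in range(rounds):
        total = (total + DELTA) & MASK
        e = (total >> 2) & 3
        for p in range(n):
            y = v[p + 1]
            z = v[p] = (v[p] + _mx(z, y, total, key, p, e)) & MASK
        y = v[0]
        z = v[n] = (v[n] + _mx(z, y, total, key, n, e)) & MASK
    return v


def decrypt_words(v, key):
    """Undo encrypt_words by running the rounds in reverse order."""
    v = list(v)
    n = len(v) - 1
    rounds = 6 + 52 // (n + 1)
    total = (rounds * DELTA) & MASK
    y = v[0]
    for _ in range(rounds):
        e = (total >> 2) & 3
        for p in range(n, 0, -1):
            z = v[p - 1]
            y = v[p] = (v[p] - _mx(z, y, total, key, p, e)) & MASK
        z = v[n]
        y = v[0] = (v[0] - _mx(z, y, total, key, 0, e)) & MASK
        total = (total - DELTA) & MASK
    return v


key = to_words(b'MTE5NDIx', False)
key += [0] * (4 - len(key))          # pad the key to four words
plain = to_words(b'1111', True)      # two words: the text and its length
cipher = encrypt_words(plain, key)
print(decrypt_words(cipher, key) == plain)
```

A round trip like this does not prove that the port matches the site's JavaScript, but it does catch slips in operator precedence or masking, which are the usual mistakes when moving this kind of code from JS to Python.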

    Test code (can be run from the Python command line):
    [root@takit]# python
    Python 2.6.6 (r266:84292, Jan 22 2014, 09:42:36)
    [GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import httphelper3
    >>> httphelper3.set_todo('test')
    set_todo(): test - 10.xxx.xxx.xxx
    <urllib3.poolmanager.ProxyManager object at 0x7f0b520b3050>
    >>> httphelper3.clear_cookie('test')
    >>> httphelper3.new_cookie('test','current_captcha_type','Z')
    >>> ret, dynamic_key=httphelper3.get_dynamic_key('test',0)
    get_dynamic_key(0)
    https://kyfw.12306.cn/otn/login/init
    get_dynamic_key_from_js(/otn/dynamicJs/lgotysx)
    https://kyfw.12306.cn/otn/dynamicJs/lgotysx
    get_content() fail: var_name = url :'/otn/dynamicJs/
    >>> dynamic_key
    ('MTE5NDIx', 'ZDUxMTRhODJkMjMzOTQyYQ%3D%3D')
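The second element of dynamic_key can be unpacked by hand: it is a URL-quoted Base64 string that wraps a hex string. For hex input, encode32 appears to behave like ordinary Base64, so the encoding step can be reproduced with the standard library alone. A small sketch in Python 3 follows; hex_then_base64 is a hypothetical helper name, and the eight cipher bytes are simply decoded from the session output above.

```python
import base64
import binascii
from urllib.parse import quote_plus


def hex_then_base64(raw):
    """Hex-encode raw bytes, then Base64-encode the resulting hex text."""
    hex_text = binascii.hexlify(raw)
    return base64.b64encode(hex_text).decode('ascii')


# Cipher bytes recovered by decoding the value printed in the session.
cipher = binascii.unhexlify('d5114a82d233942a')
value = quote_plus(hex_then_base64(cipher))
print(value)                          # ZDUxMTRhODJkMjMzOTQyYQ%3D%3D
print(base64.b64decode('MTE5NDIx'))   # b'119421'
```

The eight cipher bytes correspond to two 32-bit words, which is what encrypting the four-byte text '1111' plus its length word would produce. The key itself is readable Base64 as well.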

    OK, with these two encrypted strings plus the captcha result in hand, you can happily log in to 12306.

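    P.S. That said, the 12306 js code still has a few tricks left in it, and the day they decide to get serious things will change again, although the general approach will probably stay much the same. If 12306 does change, don't blame this post for it; that would be giving me far too much credit. Also, you are bound to ask what to do about the captcha. There is only one answer: human solving. You are not so idle that you would solve them yourself, are you? There are plenty of captcha-solving platforms... shh~~~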
    5 replies    2015-07-23 09:26:43 +08:00
    1   jack139 (OP)   2015-07-21 22:31:30 +08:00
    I don't get it, the formatting is completely gone.
    2   wind4   2015-07-22 10:45:36 +08:00
    You need to post it on GitHub or as a gist.
    3   Shazoo   2015-07-22 20:00:52 +08:00
    Hmm, can't this be handled with phantomjs? Do you really have to implement the algorithm yourself?
    4   jack139 (OP)   2015-07-23 00:06:57 +08:00
    @Shazoo phantomjs works, but it is far too slow; with Python you can issue 1000 tickets an hour.
    5   Shazoo   2015-07-23 09:26:43 +08:00
    @jack139 Got it.