首页 > 编程语言 > 详细

Python正则表达式,统计分析nginx访问日志

时间:2017-01-16 16:11:19      阅读:292      评论:0      收藏:0      [点我收藏+]

目标:

  1.正则表达式

  2.oop编程,统计nginx访问日志中不同IP地址出现的次数并排序

 

1.正则表达式

#!/usr/bin/env python
# -*- coding: utf-8 -*-


import re

# match
# 方法一
pattern1 = re.compile(rhello, re.I)

match = pattern1.match(Hello World)

if match:
    print match.group()

# 方法二

m = re.match(rhello, hello world.)

print m.group()

# search
pattern1 = re.compile(rWorld)

match = pattern1.search(Hello, hello World.)

if match:
    print match.group()


# split
pattern1 = re.compile(r\d+)
match = pattern1.split(one1two2three3)
print match
for i in match:
    print i

# findall
match = pattern1.findall(one1two2three3)
print match


# finditer
match = pattern1.finditer(one1two2three3)
for i in match:
    print i.group()

?运行代码,测试效果

 

2.oop编程,统计nginx访问日志中不同IP地址出现的次数并排序

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import re

class CountPatt(object):
    def __init__(self, patt):
        self.patt = re.compile(patt)
        self.result = {}
    def count_patt(self, fname):
        with open(fname) as fobj:
            for line in fobj:
                match = self.patt.search(line)
                if match:
                    key = match.group()
                    self.result[key] = self.result.get(key, 0) + 1

        return self.result

    def sort(self):
        result = []
        alist = self.result.items()
        for i in xrange(len(alist)):
            greater = alist[0]
            for item in alist[1:]:
                if greater[1] < item[1]:
                    greater = item
            result.append(greater)
            alist.remove(greater)
        return result


if __name__ == "__main__":
    httpd_log = /tmp/access.log
    ip_pattern = r^(\d+\.){3}\d+
    browser_pattern = rChrome|Safari|Firefox
    a = CountPatt(ip_pattern)
    print a.count_patt(httpd_log)
    print a.sort()

?运行代码,测试效果

handetiandeMacBook-Pro:test xkops$ python test2.py
{192.168.207.21: 25, 192.168.80.165: 20, 192.168.207.1: 46, 127.0.0.1: 10}
[(192.168.207.1, 46), (192.168.207.21, 25), (192.168.80.165, 20), (127.0.0.1, 10)]

 

Python正则表达式,统计分析nginx访问日志

原文:http://www.cnblogs.com/xkops/p/6289979.html

(0)
(0)
   
举报
评论 一句话评论(0
关于我们 - 联系我们 - 留言反馈 - 联系我们:wmxa8@hotmail.com
© 2014 bubuko.com 版权所有
打开技术之扣,分享程序人生!