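Extracting Basic Email Header Information with Python

1 Email content

Assume the email is saved in a file named "1.txt" with the following content: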

From:   Justin-Bieber@entertain.org on behalf of BieberLeader [leader@hello.org]
Sent:   2017-07-01 12:48
To: 'staff@hello.org'; custom@hello.org;
Willim Johnson; John Snow
Subject:    The battlefield in Winterfell
I have just met then. More details as soon as possible. So far, so good.
Sent via iPhone 7 plus

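2 Approach

  • The goal is to extract the email header information. The fields needed are:
    • sender (From:), sent time (Sent:), recipients (To:), and subject (Subject:)
  • As a first pass, it is enough to extract the lines that contain these fields.
  • Use a single extraction function: put the four keywords in a list and match them with regular expressions.
  • Each of the four fields has a global counter. Whenever a field matches, its counter is incremented by 1 to mark it as seen.
  • If one field has already matched and the next field has not matched yet, the current line continues the earlier field, so it must be read as well.
  • If the extraction function returns None, the line is skipped.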
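The regular-expression step in the list above can be tried on its own. The following is a minimal sketch, assuming a pattern of the form `.*(keyword.*)` passed to `re.match`; the variable names are illustrative only.

```python
import re

# Pattern shape: any prefix, then the keyword and the rest of the line (captured).
pattern = ".*({0}.*)".format("Sent:")

header_line = "Sent:   2017-07-01 12:48"
quoted_line = "> Sent:   2017-07-01 12:48"  # a prefix before the keyword is skipped
body_line = "Sent via iPhone 7 plus"        # no colon after "Sent", so no match

header_match = re.match(pattern, header_line)
quoted_match = re.match(pattern, quoted_line)
body_match = re.match(pattern, body_line)

print(header_match.group(1))
print(quoted_match.group(1))
print(body_match)
```

Because the keyword includes the trailing colon, the signature line "Sent via iPhone 7 plus" is not mistaken for the Sent: header.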
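# coding: utf-8
import re

# Global counters: a value above zero means that header has already been seen.
from_count = 0
sent_count = 0
to_count = 0
subject_count = 0


def inspect_string(string):
    """Return the header text found in this line, or None if the line is not part of the header."""
    global from_count, sent_count, to_count, subject_count

    # Update the flags once per line, before deciding what to return.
    if re.match(".*(From:.*)", string):
        from_count += 1
    if re.match(".*(Sent:.*)", string):
        sent_count += 1
    if re.match(".*(To:.*)", string):
        to_count += 1
    if re.match(".*(Subject:.*)", string):
        subject_count += 1

    # A line that contains a keyword: return the text from the keyword onwards.
    keyword_list = ['From:', 'Sent:', 'To:', 'Subject:']
    for keyword in keyword_list:
        regex_str = ".*({0}.*)".format(keyword)
        match_obj = re.match(regex_str, string)
        if match_obj:
            return match_obj.group(1)

    # A line without a keyword: keep it if a header has started
    # and the next header has not appeared yet (continuation line).
    if from_count > 0 and sent_count < 1:
        return string
    if sent_count > 0 and to_count < 1:
        return string
    if to_count > 0 and subject_count < 1:
        return string
    return None


# Open in text mode so each line is a str, not bytes (assumes the file is UTF-8 encoded).
with open('1.txt', 'r', encoding='utf-8') as f:
    for line in f:
        result = inspect_string(line.rstrip('\n'))
        if result is None:
            continue
        print(result)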
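The same idea can also be written without module-level state. The sketch below is a simplified variant, not the original author's code: it assumes the headers appear in the order From, Sent, To, Subject, so "a header has started and Subject: has not been seen" stands in for the pairwise counter checks. The names `extract_header_lines`, `KEYWORDS`, and `sample` are hypothetical, and the sample lines assume the To: field wraps onto a second line.

```python
import re

KEYWORDS = ("From:", "Sent:", "To:", "Subject:")


def extract_header_lines(lines):
    """Yield header lines, including continuation lines, from an iterable of str."""
    seen = set()  # keywords matched so far; replaces the global counters
    for raw in lines:
        line = raw.rstrip("\r\n")
        matched = False
        for keyword in KEYWORDS:
            match_obj = re.match(".*({0}.*)".format(re.escape(keyword)), line)
            if match_obj:
                seen.add(keyword)
                yield match_obj.group(1)
                matched = True
                break
        # Continuation: some header has started, but Subject: (the last one) has not.
        if not matched and seen and "Subject:" not in seen:
            yield line


# In-memory stand-in for the lines of 1.txt (assumed layout).
sample = [
    "From:   Justin-Bieber@entertain.org on behalf of BieberLeader [leader@hello.org]\n",
    "Sent:   2017-07-01 12:48\n",
    "To: 'staff@hello.org'; custom@hello.org;\n",
    "Willim Johnson; John Snow\n",
    "Subject:    The battlefield in Winterfell\n",
    "I have just met then. More details as soon as possible. So far, so good.\n",
    "Sent via iPhone 7 plus\n",
]

header_lines = list(extract_header_lines(sample))
for header_line in header_lines:
    print(header_line)
```

Keeping the state inside the generator means the function can be called on several emails in one run without resetting any counters.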

3 Output

From:   Justin-Bieber@entertain.org on behalf of BieberLeader [leader@hello.org]
Sent:   2017-07-01 12:48
To: 'staff@hello.org'; custom@hello.org;
Willim Johnson; John Snow
Subject:    The battlefield in Winterfell
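If the extracted lines are needed as structured data rather than printed text, they can be grouped by field. This is a rough sketch, assuming the extracted lines are available as a list of strings and that the To: field spans two lines. `headers_to_dict`, `FIELDS`, and `FIELD_RE` are hypothetical names. Unlike the pattern used in the script, this one anchors the keyword at the start of the line.

```python
import re

FIELDS = ("From", "Sent", "To", "Subject")
# A header line starts with one of the field names followed by a colon.
FIELD_RE = re.compile(r"^({0}):\s*(.*)$".format("|".join(FIELDS)))


def headers_to_dict(header_lines):
    """Group header lines by field; continuation lines extend the previous field."""
    result = {}
    current = None
    for line in header_lines:
        match_obj = FIELD_RE.match(line)
        if match_obj:
            current = match_obj.group(1)
            result[current] = match_obj.group(2).strip()
        elif current is not None:
            # No keyword: append the text to the field that is currently open.
            result[current] = (result[current] + " " + line.strip()).strip()
    return result


header_lines = [
    "From:   Justin-Bieber@entertain.org on behalf of BieberLeader [leader@hello.org]",
    "Sent:   2017-07-01 12:48",
    "To: 'staff@hello.org'; custom@hello.org;",
    "Willim Johnson; John Snow",
    "Subject:    The battlefield in Winterfell",
]

headers = headers_to_dict(header_lines)
for field in FIELDS:
    print("{0}: {1}".format(field, headers[field]))
```

With this layout the two recipient lines end up joined under the single key "To".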

Reposted from: https://www.jianshu.com/p/11de9fc6a74d

