Best Practices for Ingesting WAF Firewall Data into Tencent Cloud ES (Part 2)

Author: 岳涛
Published: 2023-08-10 12:13:40
Included in the column: 大数据生态 (Big Data Ecosystem)

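Notes

The problem and solution described in this article apply to Tencent Cloud Elasticsearch Service (ES).

Tencent Cloud Logstash is also used.

Because of its length, this article is split into two parts. For the first half, see: Best Practices for Ingesting WAF Firewall Data into Tencent Cloud ES (Part 1)

I. Introduction to Logstash

1. Logstash plugins

1.1 Filter configuration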

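  • Grok

Grok is the plugin we use most in data processing and one of the most important Logstash plugins. Inside grok you can write regular expressions directly or reference predefined patterns by name.

The grok match syntax is: %{SYNTAX:SEMANTIC:TYPE}

  • SYNTAX: a regular expression, or the name of a predefined pattern
  • SEMANTIC: an identifier, that is, the name of the field that receives the matched text (any name you like, as long as you can tell the fields apart)
  • TYPE: an optional type conversion; int and float are currently supported

For example, NUMBER matches 3.44 and IP matches 192.168.21.2.

These two values, separated by a space, can therefore be matched with %{NUMBER:duration} %{IP:client}.

Many ready-made patterns are provided officially. Take the following two predefined grok patterns as an example:

USERNAME [a-zA-Z0-9._-]+
USER %{USERNAME}

The first column is the name of the grok pattern, which can be referenced directly; the second column is its definition. The first line defines a grok pattern with an ordinary regular expression. The second line defines one in terms of another grok pattern, which shows that patterns can be nested.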
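To make the name/definition relationship concrete, here is a minimal Ruby sketch of how %{SYNTAX:SEMANTIC} references could be expanded into an ordinary regular expression. It is not how Logstash implements grok: the `PATTERNS` table and the `expand` helper are hypothetical, the NUMBER and IP entries are simplified stand-ins for the official definitions, and the optional TYPE part is ignored.

```ruby
# Hypothetical mini pattern table. USERNAME and USER mirror the two lines above;
# NUMBER and IP are simplified stand-ins, not the official definitions.
PATTERNS = {
  "USERNAME" => "[a-zA-Z0-9._-]+",
  "USER"     => "%{USERNAME}",
  "NUMBER"   => "[+-]?\\d+(?:\\.\\d+)?",
  "IP"       => "\\d{1,3}(?:\\.\\d{1,3}){3}"
}.freeze

# Expand %{SYNTAX} and %{SYNTAX:SEMANTIC} references into a plain regex source.
def expand(pattern)
  pattern.gsub(/%\{(\w+)(?::(\w+))?\}/) do
    name, field = $1, $2
    body = expand(PATTERNS.fetch(name)) # nested definitions are expanded recursively
    field ? "(?<#{field}>#{body})" : "(?:#{body})"
  end
end

regex = Regexp.new(expand("%{NUMBER:duration} %{IP:client} %{USER:user}"))
m = regex.match("3.44 192.168.21.2 alice")
puts m[:duration], m[:client], m[:user]
```

USER resolves through USERNAME, which is the nesting described above; every SEMANTIC name becomes a named capture group.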

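Extracting substrings with grok regular expressions

Capture the characters before Report as the value of the temMsg field (because .* is greedy, this is everything before the last occurrence of Report):

(?<temMsg>(.*)(?=Report)/?)

Capture everything from Report to the end of the line as the value of temMsg (the match starts at Report, so Report itself is included):

(?<temMsg>(?=Report)(.*)/?)

Used in a filter:

grok{
        match => { 
                 # capture the characters before "Report" as the value of temMsg
                "message" => "(?<temMsg>(.*)(?=Report)/?)" 
            }
    }

This extracts part of a log line relative to a keyword, so the log must contain the keyword "Report". Replace the keyword to suit your own logs.

Note: (?=Report) is a lookahead, which requires Report to follow the current position; (?<=Report) is a lookbehind, which requires Report to precede it. Neither consumes characters, so swapping "=" for "<=" changes whether the keyword ends up in the captured value. For example:

(?<temMsg>(?=Report)(.*)/?) can be written as (?<temMsg>(?<=Report)(.*)/?)

The captured value then starts after Report and no longer contains it. Conversely, (?<temMsg>(.*)(?<=Report)/?) captures everything up to and including Report. The same applies to the expressions below.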
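The difference between the lookahead and lookbehind forms is easy to get wrong, so here is a small Ruby check. Ruby's regex engine uses the same named-group and lookaround syntax as the expressions above, but this is only a sketch outside Logstash; the sample line and the hash keys are made up for illustration.

```ruby
line = "header Report body"

# Hash of label => pattern; the labels are hypothetical names for this demo.
cases = {
  before_excl: /(?<temMsg>(.*)(?=Report)\/?)/,  # text before Report
  from_incl:   /(?<temMsg>(?=Report)(.*)\/?)/,  # starts at Report, keyword included
  after_excl:  /(?<temMsg>(?<=Report)(.*)\/?)/, # starts right after Report
  upto_incl:   /(?<temMsg>(.*)(?<=Report)\/?)/  # ends right after Report
}

results = cases.transform_values { |re| re.match(line)[:temMsg] }
results.each { |name, value| puts "#{name}: #{value.inspect}" }
```

Only the two forms where the lookaround sits on the far side of .* from the keyword leave Report out of the captured value.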

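Other examples:

Capture the value between report and msg, excluding both report and msg:

(?<temMsg>(?<=report).*?(?=msg))

Capture the value including report but excluding msg:

(?<temMsg>(report).*?(?=msg))

Capture the value excluding report but including msg:

(?<temMsg>(?<=report).*?(msg))

Capture text that starts with report and ends with msg or request, head and tail included:

(?<temMsg>(report).*?(msg|request))

Capture text that starts with report and ends before msg or request. As written, the tail is excluded but the leading report is still part of the value; use (?<=report) in place of (report) to exclude the head as well:

(?<temMsg>(report).*?(?=(msg|request)))

Used in a filter:

grok{
        match => { 
                 # capture the text after "report" and before "msg" as the value of temMsg
                "message" => "(?<temMsg>(?<=report).*?(?=msg))" 
            }
    }

These expressions extract part of a log line between keywords, so the log must contain the keywords used by the expression (report, msg, request).

To reuse them, simply substitute your own keywords.

(Note: if one of these expressions raises an error, the parentheses around a single literal string can be dropped. For example, (report).*?(?=msg) can be written as report.*?(?=msg).)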
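As a quick way to compare the five variants side by side, the following Ruby sketch runs each expression against one made-up log line. The sample text and the labels are assumptions for illustration; the patterns themselves are copied unchanged from the list above.

```ruby
line = "report: disk full msg: retry request: 42"

patterns = {
  "between, both excluded" => /(?<temMsg>(?<=report).*?(?=msg))/,
  "report included"        => /(?<temMsg>(report).*?(?=msg))/,
  "msg included"           => /(?<temMsg>(?<=report).*?(msg))/,
  "head and tail included" => /(?<temMsg>(report).*?(msg|request))/,
  "tail excluded"          => /(?<temMsg>(report).*?(?=(msg|request)))/
}

# Map every label to the text its pattern captures in temMsg.
extracted = patterns.transform_values { |re| re.match(line)[:temMsg] }
extracted.each { |label, value| puts "#{label}: #{value.inspect}" }
```

Because .*? is lazy, each pattern stops at the first msg or request it can reach rather than the last one.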

To capture the first 500 characters of a log line, use the grok regular expression (?<MYELF>([\s\S]{500})):

 grok{
       match => {
              # capture 500 characters of the log line as the value of MYELF
              "message" => "(?<MYELF>([\s\S]{500}))"  
             }
     }
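One property of the fixed quantifier {500} is that it cannot match a line shorter than 500 characters. The Ruby sketch below illustrates this and shows a possible "at most n characters" variant; `exact_head` and `capped_head` are hypothetical helper names, and the capped form is a suggestion rather than part of the original rule.

```ruby
# Fixed-length form, as in the rule above (n characters, newlines included).
def exact_head(n)
  /(?<MYELF>([\s\S]{#{n}}))/
end

# Assumed alternative: anchor at the start and accept up to n characters.
def capped_head(n)
  /\A(?<MYELF>[\s\S]{0,#{n}})/
end

long_line  = "a" * 600
short_line = "short line"

puts exact_head(500).match(long_line)[:MYELF].length # 500
puts exact_head(500).match(short_line).inspect       # nil, fewer than 500 characters
puts capped_head(500).match(short_line)[:MYELF]      # short line
```

In a grok filter an unmatched line is tagged _grokparsefailure, so short lines would be flagged unless a capped quantifier such as {0,500} is used.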

To test matching rules, you can use the Grok Debugger in Kibana's Dev Tools.

Demo examples

(1) A made-up HTTP request log:

114.114.114.114 GET /index.html 15824 0.043

A record like this can be matched with the following grok pattern:

%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}

Result:

{
  "duration": "0.043",
  "request": "/index.html",
  "method": "GET",
  "bytes": "15824",
  "client": "114.114.114.114"
}

(2) From the log line below we want the leading timestamp (starting with 2020-03-18) and the log level INFO:

2020-03-18 14:04:23.944 [DubboServerHandler-10.50.245.25:63046-thread-168] INFO  c.f.l.d.LogTraceDubboProviderFilter - c79b0905-03c7-4e54-a5a6-ff1b34058cdf CALLEE_IN dubbo:EstatePriceService.listCellPrice

The following pattern does this:

\s*%{TIMESTAMP_ISO8601:timestamp} \s*\[%{DATA:current_thread}\]\s*%{LOGLEVEL:loglevel}\s*(?<class_info>([\S+]*))

The filtered result is:

{
  "current_thread": "DubboServerHandler-10.50.245.25:63046-thread-168",
  "loglevel": "INFO",
  "class_info": "c.f.l.d.LogTraceDubboProviderFilter",
  "timestamp": "2020-03-18 14:04:23.944"
}

(3) Next, a complete nginx access log line. The sample is shown below, and we want grok to turn it into JSON.

10.0.16.138 - - [18/Aug/2020:16:21:05 +0800] "GET / HTTP/1.1" 304 0 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.87 Safari/537.36" "-"

The grok rule is as follows. The quoted field after body_sent is the referer, followed by the user agent; the last quoted field is captured as http_x_forwarded_for, a name assumed from the usual nginx log format:

^%{IPV4:remote_addr} - (%{USERNAME:remote_user}|-) \[%{HTTPDATE:time_local}\] \"%{WORD:method} %{DATA:request} HTTP/%{NUMBER:httpversion}\" %{NUMBER:status:int} %{NUMBER:body_sent:int} \"%{NOTSPACE:http_referer}\" %{QS:http_user_agent} %{NOTSPACE:http_x_forwarded_for}

Result:

{
  "remote_addr": "10.0.16.138",
  "request": "/",
  "method": "GET",
  "time_local": "18/Aug/2020:16:21:05 +0800",
  "http_referer": "-",
  "remote_user": "-",
  "http_user_agent": "\"Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.87 Safari/537.36\"",
  "http_x_forwarded_for": "\"-\"",
  "httpversion": "1.1",
  "body_sent": 0,
  "status": 304
}

(4) A Java log line:

2020-12-04 14:16:30.003  INFO 19095 --- [pool-4-thread-1] com.yck.laochangzhang.task.EndorseTask   : 平台发起结束

The grok configuration is:

(?<datetime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\.\d{3})\s+(?<level>\w+)\s+\d+\s+-+\s\[[^\[\]]+\]\s+(?<handler>\S+)\s+:(?<msg>.*)

Result (msg keeps the space that follows the colon in the log line):

{
  "msg": " 平台发起结束",
  "handler": "com.yck.laochangzhang.task.EndorseTask",
  "datetime": "2020-12-04 14:16:30.003",
  "level": "INFO"
}
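Because this expression uses only plain regex syntax and no %{...} references, it can also be tried outside Logstash. The Ruby sketch below applies it to the sample line; the constant name `JAVA_LOG` is hypothetical, and Ruby's engine is assumed to behave like grok's for this expression.

```ruby
JAVA_LOG = /(?<datetime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\.\d{3})\s+(?<level>\w+)\s+\d+\s+-+\s\[[^\[\]]+\]\s+(?<handler>\S+)\s+:(?<msg>.*)/

line = "2020-12-04 14:16:30.003  INFO 19095 --- [pool-4-thread-1] com.yck.laochangzhang.task.EndorseTask   : 平台发起结束"

# named_captures returns a Hash of group name => captured text.
fields = JAVA_LOG.match(line).named_captures
fields.each { |name, value| puts "#{name}: #{value.strip}" }
```

The process ID and the thread name are matched but not captured, since they sit outside the named groups.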

II. Implementation

1. DLP log ingestion

Raw log (fully and strictly desensitized):

2022-07-21T07:15:13.150Z 10.13.1.74 <6>Jul 21 15:15:34 [主机名] CEF:0|SkyGuard|Data Security|[ID]|DLP Incident Syslog| serialId=[ID] transactionId=[替换transactionId] incidentTime=2022-07-21 14:57:23 detectTime=2022-07-21 14:55:27 policyGroupNames=默认策略组 policyNames=[替换]-客户数据-信息 statusType=新 severityType=信息 maxMatches=16 transactionSize=116.87 KB detectAgent=Endpoint(MacBook-Pro.local) analyzeAgent=Content Analysis Engine(MacBook-Pro.local) tagContent=N/A ignoreStatus=未忽略 breachContents=[邮箱地址]; [邮箱地址]; [邮箱地址]; [邮箱地址]; [邮箱地址]; [邮箱地址]; [邮箱地址]; [邮箱地址]; [邮箱地址]; [邮箱地址]; [邮箱地址]; [邮箱地址]; [邮箱地址]; [邮箱地址]; [邮箱地址]; [邮箱地址] totalMatches=16 classificationNames=N/A classificationLevelNames=N/A tagNames=N/A policyUuid=[替换policyUuid] ruleUuid=[替换ruleUuid] ruleName=邮件地址(16) sourceName=[替换姓名]-seti destinationNames=[域名] channelType=HTTP actionType=放行 attachmentNames=N/A details=http://[域名]/function/proto/json-to-proto-json-to-proto/proxy workModeType=阻断 sourceGroups=N/A sourceOu=[替换OU],OU=[替换OU],OU=[替换OU],OU=[替换OU],OU=[替换OU],DC=[替换DC],DC=[替换DC] sourceIp=[替换IP],[替换IP] destinationIps=[替换IP] sourceMail=[邮箱地址] sourceManagers=N/A sourceDepartment=N/A urlCategories=N/A destinationCountries=N/A destinationCities=N/A destinationLocations=N/A riskLevel=N/A sourceTitle=N/A sourceMobile=N/A sourceTelephone=N/A sourceEmployeeID=N/A sourceOffice=N/A sourceUuid=[替换ID] sourceLogonName=seti sourceDomain=N/A destinationUuid=N/A destinationLogonName=N/A destinationMail=N/A destinationDomain=[域名] messageId=N/A deviceType=笔记本 corporateType=公司内部 operationSystem=macOS hostname=[主机名] dmDeviceName=sec-dlp-lite-1 sourceMacAddress=[MAC地址] dataMethod=N/A

The filter rules are as follows:

filter{
    grok {
      match => {
        "message" => "%{TIMESTAMP_ISO8601:messageTime} %{IP:clientIp} (?<logTime>\<.*?\>%{MONTH}%{SPACE}%{MONTHDAY}%{SPACE}%{TIME}) (?<logName>.*?\-.*?\-.*?\-.*?\-.*?) (?<logTag>.*?\|.*?\|.*?\|.*?\|.*?\|) %{GREEDYDATA:logContents}"
      }
    }

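    # Remove the space after each semicolon so the kv filter does not split breachContents
    mutate { 
        gsub => ["logContents", "; ", ";"]
    }

    # Parse the key=value pairs
    kv {
        source => "logContents"
        field_split => " "
        value_split => "="
    }

    # Restore the spaces in breachContents
    mutate { 
        gsub => ["breachContents", ";", "; "]
    }

    # Parse the ISO8601 timestamp into @timestamp
    date {
        match => ["messageTime", "ISO8601"]
        target => "@timestamp"
    }

    # Remove fields that are no longer needed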
    mutate {
        remove_field => ["logContents", "message"]
    }
}
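The gsub steps exist because splitting on spaces cuts a value at its first space. Other values in this log, such as transactionSize=116.87 KB, also contain spaces, so here is a hedged Ruby sketch of a different idea: split only on whitespace that is followed by the next key=. This is not the approach used in the filter above; `parse_kv` is a hypothetical helper and the sample values are shortened, made-up stand-ins.

```ruby
# Split "k1=v1 k2=v 2 k3=v3" only where whitespace precedes the next "key=",
# so values may themselves contain spaces.
def parse_kv(text)
  text.strip.split(/\s+(?=\w+=)/).each_with_object({}) do |pair, fields|
    key, value = pair.split("=", 2)
    fields[key] = value
  end
end

sample = "maxMatches=16 transactionSize=116.87 KB " \
         "analyzeAgent=Content Analysis Engine(MacBook-Pro.local) " \
         "breachContents=a@example.com; b@example.com totalMatches=2"

fields = parse_kv(sample)
fields.each { |key, value| puts "#{key} => #{value}" }
```

The trade-off is that a value which itself contains a space followed by word= would be split in the wrong place, so this only suits logs whose values never look like a new key.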

2. Network authentication log ingestion

Raw log (fully and strictly desensitized):

{
  "severity": 0,
  "priority": 0,
  "facility_label": "kernel",
  "type": "clearpass_syslog",
  "host": "[脱敏后的IP地址]",
  "severity_label": "Emergency",
  "message": "<143>Nov 29 2022 11:23:56 [脱敏后的IP地址] CEF:0|Aruba Networks|ClearPass|6.10.6.186545|1000|RADIUS Authentications|1|cat=Insight Logs dvc=[脱敏后的IP地址] duser=[脱敏后的用户名] dmac=[脱敏后的MAC地址] cs2=[脱敏后的值] cs2Label=Authentication Protocol src=[脱敏后的IP地址] cs4=[脱敏后的值] cs4Label=Login Status destinationServiceName=[脱敏后的值] cs3=[脱敏后的值] cs3Label=Authentication Source dpriv=[脱敏后的值]|[脱敏后的值] cs5=[脱敏后的值] cs5Label=Enforcement Profiles\n",
  "facility": 0
}

The filter rules are as follows:

filter {
    grok {
        match => {
          "message" => "<%{NUMBER:message_type_id}>(?<message_time>%{MONTH}%{SPACE}%{MONTHDAY}%{SPACE}%{YEAR}%{SPACE}%{TIME})%{SPACE}%{IP:message_ip}%{SPACE}(?<message_cef>.*?\|)(?<networks>.*?\|)(?<message_type>.*?\|)(?<message_info>.*?\|)(?<message_number>.*?\|)(?<message_logged>.*?\|)(?<message_flag>.*?\|)%{GREEDYDATA:logContents}"
        }
    }

    # Strip the trailing "|" delimiter from each header field
    mutate {
        gsub => ["message_cef", "\|", "",
                 "message_number", "\|", "",
                 "message_type", "\|", "",
                 "networks", "\|", "",
                 "message_info", "\|", "",
                 "message_flag", "\|", "",
                 "message_logged", "\|", ""
                ]
    }

    # Remove the trailing newline
    mutate {
        gsub => ["logContents", "\\n", ""]
    }

    # Parse the key=value pairs
    kv {
        source => "logContents"
        field_split => " "
        value_split => "="
    }

    # Restore values that were cut short because the space is also the field separator
    mutate {
        gsub => ["cat", "Session", "Session Logs",
                 "cat", "Insight", "Insight Logs",
                 "cs1Label", "Login", "Login Status",
                 "cs2Label", "Authentication", "Authentication Protocol",
                 "cs4Label", "Login", "Login Status",
                 "cs3Label", "Authentication", "Authentication Source",
                 "cs5Label", "Enforcement", "Enforcement Profiles",
                 "dpriv", "\[Guest\]\|\[User", "Guest|User Authenticated"
                ]
    }

    # Remove fields that are no longer needed
    mutate {
        remove_field => ["severity","severity_label","type","priority","host","facility","facility_label"]
    }
}
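The grok expression above walks through the seven pipe-delimited CEF header fields one lazy group at a time. As a rough illustration of the same split, the Ruby sketch below uses a limited split so that pipes inside the extension part (for example in dpriv) are left alone. It assumes a complete seven-field header and does not handle escaped pipes; `split_cef` is a hypothetical helper, the key names reuse the field names from the filter, and the dpriv value is a made-up example.

```ruby
# Field names taken from the grok expression above, in header order.
CEF_HEADER_KEYS = %w[message_cef networks message_type message_info
                     message_number message_logged message_flag].freeze

# Split into at most 8 pieces: 7 header fields plus the untouched extension.
def split_cef(cef)
  *header, extension = cef.split("|", 8)
  CEF_HEADER_KEYS.zip(header).to_h.merge("logContents" => extension)
end

cef = "CEF:0|Aruba Networks|ClearPass|6.10.6.186545|1000|RADIUS Authentications|1|" \
      "cat=Insight Logs dpriv=[Guest]|[User Authenticated]"

parsed = split_cef(cef)
parsed.each { |key, value| puts "#{key}: #{value}" }
```

With the limit of 8, no trailing delimiter needs to be stripped from the header fields afterwards.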

3. Endpoint log ingestion

Raw log (fully and strictly desensitized):

2022-06-06T00:56:25.042Z [脱敏] <6>2022-06-06T08:56:25+08:00 [脱敏] HRESS[2100]: {"center_version":"1.0.37.0","event":{"time":1654476985,"client_id":3412,"hostname":"[脱敏]","group_name":"[脱敏]","class":1,"rule_name":"IE首页项","action":4,"proc_path":"C:\Program Files (x86)\Tencent\QQPCMgr\13.5.20525.234\QQPCTray.exe","cmdline":""C:\Program Files (x86)\Tencent\QQPCMgr\13.5.20525.234\QQPCTray.exe" /elevated /regrun","res_path":"HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\Main\Start Page","res_val":"https://hao.qq.com/?unc=Af31026\u0026s=o400493_1","treatment":131075},"event_type":"系统加固"}
2022-06-07T11:03:25.000Z [脱敏] {"center_version":"1.0.37.0","event":{"time":1654599805,"client_id":71,"hostname":"[脱敏]","group_name":"[脱敏]","class":"rogue","domain":"[脱敏]"},"event_type":"恶意网站拦截"}
2022-06-07T10:57:25.000Z [脱敏] {"center_version":"1.0.37.0","event":{"time":1654599445,"client_id":1835,"hostname":"[脱敏]","group_name":"[脱敏]","software_name":"搜狗输入法","proc_path":"C:\Program Files\Google\Chrome\Application\chrome.exe","file_path":"C:\Users\linsu\Downloads\sogou_pinyin_121b.exe","treatment":0},"event_type":"软件安装拦截"}
2022-06-08T01:34:25.000Z [脱敏] {"center_version":"1.0.37.0","event":{"time":1654652065,"client_id":3178,"hostname":"[脱敏]","group_name":"[脱敏]","kbid":"5005260","desc":"现已确认 Microsoft 软件产品中存在可能会影响您的系统的安全问题。您可以通过安装本 Microsoft 更新程序来保护您的系统不受侵害。有关本更新程序中所含问题的完整列表,请参阅相关 Microsoft 知识库文章。安装本更新程序之后,可能需要重新启动系统。","level":0,"state":7},"event_type":"漏洞修复"}
2022-06-08T01:50:25.000Z [脱敏] {"center_version":"1.0.37.0","event":{"time":1654653025,"client_id":2925,"hostname":"[脱敏]","group_name":"[脱敏]","software_name":"360画报","proc_path":"C:\360浏览器\360se6\Application\components\sesvc\sesvc.exe","file_path":"C:\360浏览器\360se6\Application\components\huabao\13.0.42.0\360hb_inst.exe","treatment":3},"event_type":"软件安装拦截"}

The filter rules are as follows:

filter {
  ruby {
    code => 'event.set("space_num", event.get("message").split("{")[0].count(" "))'
  }


  if [space_num] == 5 {
    grok {
      match => {"message" => "%{TIMESTAMP_ISO8601:message_time} %{IP:client_ip} (\<.*?\>%{TIMESTAMP_ISO8601:data_time}) (%{DATA:logsource}) %{DATA:program} %{GREEDYDATA:HRESS2100}"}
    }
  }

  if [space_num] == 2 {
    grok {
      match => {"message" => "%{TIMESTAMP_ISO8601:message_time} %{IP:client_ip} %{GREEDYDATA:HRESS2100}"}
    }
  }

  if [space_num] == 0 {
    grok {
      match => {"message" => "%{GREEDYDATA:HRESS2100}"}
    }
  }

  json {
    source => "HRESS2100"
  }

  mutate {
    add_field => {"@event" => "%{event}"}
  }

  json {
    source => "@event"  # parse the serialized event object so its keys become top-level fields
  }

  mutate {
    # Convert field types
    convert => {
      "time" => "string"
      "client_id" => "string"
      "priority" => "string"
      "severity" => "string"
      "facility" => "string"
    }
  }

  date {
    match => ["time", "UNIX"]
    target => "@timestamp"
  }

  # Remove fields that are no longer needed
  mutate {
    remove_field => [ "HRESS2100" , "message", "@event", "event", "space_num"]
  }
}
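The ruby filter at the top of this rule decides which grok branch to use by counting the spaces in front of the first "{". The sketch below runs the same expression as plain Ruby on shortened sample lines. The `space_num` and `layout` helpers, the `to_s` guard for an empty message, and the sample host and IP values are all assumptions added for illustration.

```ruby
# Same expression as in the ruby filter; to_s guards against an empty message,
# where split returns an empty array.
def space_num(message)
  message.split("{")[0].to_s.count(" ")
end

# Map the space count to the branch the filter would take (labels are made up).
def layout(message)
  case space_num(message)
  when 5 then :syslog_with_program
  when 2 then :timestamp_and_ip
  when 0 then :json_only
  else        :unknown
  end
end

full  = '2022-06-06T00:56:25.042Z 10.0.0.1 <6>2022-06-06T08:56:25+08:00 host-01 HRESS[2100]: {"center_version":"1.0.37.0"}'
short = '2022-06-07T11:03:25.000Z 10.0.0.1 {"center_version":"1.0.37.0"}'
bare  = '{"center_version":"1.0.37.0"}'

[full, short, bare].each { |msg| puts "#{space_num(msg)} -> #{layout(msg)}" }
```

A line with any other space count matches none of the three conditionals, so HRESS2100 is never set for it and the later json step has nothing to parse.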

Appendix

Official predefined grok patterns

USERNAME [a-zA-Z0-9._-]+
USER %{USERNAME}
EMAILLOCALPART [a-zA-Z][a-zA-Z0-9_.+-=:]+
EMAILADDRESS %{EMAILLOCALPART}@%{HOSTNAME}
INT (?:[+-]?(?:[0-9]+))
BASE10NUM (?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+)))
NUMBER (?:%{BASE10NUM})
BASE16NUM (?<![0-9A-Fa-f])(?:[+-]?(?:0x)?(?:[0-9A-Fa-f]+))
BASE16FLOAT \b(?<![0-9A-Fa-f.])(?:[+-]?(?:0x)?(?:(?:[0-9A-Fa-f]+(?:\.[0-9A-Fa-f]*)?)|(?:\.[0-9A-Fa-f]+)))\b

POSINT \b(?:[1-9][0-9]*)\b
NONNEGINT \b(?:[0-9]+)\b
WORD \b\w+\b
NOTSPACE \S+
SPACE \s*
DATA .*?
GREEDYDATA .*
QUOTEDSTRING (?>(?<!\\)(?>"(?>\\.|[^\\"]+)+"|""|(?>'(?>\\.|[^\\']+)+')|''|(?>`(?>\\.|[^\\`]+)+`)|``))
UUID [A-Fa-f0-9]{8}-(?:[A-Fa-f0-9]{4}-){3}[A-Fa-f0-9]{12}
# URN, allowing use of RFC 2141 section 2.3 reserved characters
URN urn:[0-9A-Za-z][0-9A-Za-z-]{0,31}:(?:%[0-9a-fA-F]{2}|[0-9A-Za-z()+,.:=@;$_!*'/?#-])+

# Networking
MAC (?:%{CISCOMAC}|%{WINDOWSMAC}|%{COMMONMAC})
CISCOMAC (?:(?:[A-Fa-f0-9]{4}\.){2}[A-Fa-f0-9]{4})
WINDOWSMAC (?:(?:[A-Fa-f0-9]{2}-){5}[A-Fa-f0-9]{2})
COMMONMAC (?:(?:[A-Fa-f0-9]{2}:){5}[A-Fa-f0-9]{2})
IPV6 ((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%.+)?
IPV4 (?<![0-9])(?:(?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5]))(?![0-9])
IP (?:%{IPV6}|%{IPV4})
HOSTNAME \b(?:[0-9A-Za-z][0-9A-Za-z-]{0,62})(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*(\.?|\b)
IPORHOST (?:%{IP}|%{HOSTNAME})
HOSTPORT %{IPORHOST}:%{POSINT}

# paths
PATH (?:%{UNIXPATH}|%{WINPATH})
UNIXPATH (/([\w_%!$@:.,+~-]+|\\.)*)+
TTY (?:/dev/(pts|tty([pq])?)(\w+)?/?(?:[0-9]+))
WINPATH (?>[A-Za-z]+:|\\)(?:\\[^\\?*]*)+
URIPROTO [A-Za-z]([A-Za-z0-9+\-.]+)+
URIHOST %{IPORHOST}(?::%{POSINT:port})?
# uripath comes loosely from RFC1738, but mostly from what Firefox
# doesn't turn into %XX
URIPATH (?:/[A-Za-z0-9$.+!*'(){},~:;=@#%&_\-]*)+
#URIPARAM \?(?:[A-Za-z0-9]+(?:=(?:[^&]*))?(?:&(?:[A-Za-z0-9]+(?:=(?:[^&]*))?)?)*)?
URIPARAM \?[A-Za-z0-9$.+!*'|(){},~@#%&/=:;_?\-\[\]<>]*
URIPATHPARAM %{URIPATH}(?:%{URIPARAM})?
URI %{URIPROTO}://(?:%{USER}(?::[^@]*)?@)?(?:%{URIHOST})?(?:%{URIPATHPARAM})?

# Months: January, Feb, 3, 03, 12, December
MONTH \b(?:[Jj]an(?:uary|uar)?|[Ff]eb(?:ruary|ruar)?|[Mm](?:a|ä)?r(?:ch|z)?|[Aa]pr(?:il)?|[Mm]a(?:y|i)?|[Jj]un(?:e|i)?|[Jj]ul(?:y)?|[Aa]ug(?:ust)?|[Ss]ep(?:tember)?|[Oo](?:c|k)?t(?:ober)?|[Nn]ov(?:ember)?|[Dd]e(?:c|z)(?:ember)?)\b
MONTHNUM (?:0?[1-9]|1[0-2])
MONTHNUM2 (?:0[1-9]|1[0-2])
MONTHDAY (?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])

# Days: Monday, Tue, Thu, etc...
DAY (?:Mon(?:day)?|Tue(?:sday)?|Wed(?:nesday)?|Thu(?:rsday)?|Fri(?:day)?|Sat(?:urday)?|Sun(?:day)?)

# Years?
YEAR (?>\d\d){1,2}
HOUR (?:2[0123]|[01]?[0-9])
MINUTE (?:[0-5][0-9])
# '60' is a leap second in most time standards and thus is valid.
SECOND (?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?)
TIME (?!<[0-9])%{HOUR}:%{MINUTE}(?::%{SECOND})(?![0-9])
# datestamp is YYYY/MM/DD-HH:MM:SS.UUUU (or something like it)
DATE_US %{MONTHNUM}[/-]%{MONTHDAY}[/-]%{YEAR}
DATE_EU %{MONTHDAY}[./-]%{MONTHNUM}[./-]%{YEAR}
ISO8601_TIMEZONE (?:Z|[+-]%{HOUR}(?::?%{MINUTE}))
ISO8601_SECOND (?:%{SECOND}|60)
TIMESTAMP_ISO8601 %{YEAR}-%{MONTHNUM}-%{MONTHDAY}[T ]%{HOUR}:?%{MINUTE}(?::?%{SECOND})?%{ISO8601_TIMEZONE}?
DATE %{DATE_US}|%{DATE_EU}
DATESTAMP %{DATE}[- ]%{TIME}
TZ (?:[APMCE][SD]T|UTC)
DATESTAMP_RFC822 %{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} %{TZ}
DATESTAMP_RFC2822 %{DAY}, %{MONTHDAY} %{MONTH} %{YEAR} %{TIME} %{ISO8601_TIMEZONE}
DATESTAMP_OTHER %{DAY} %{MONTH} %{MONTHDAY} %{TIME} %{TZ} %{YEAR}
DATESTAMP_EVENTLOG %{YEAR}%{MONTHNUM2}%{MONTHDAY}%{HOUR}%{MINUTE}%{SECOND}

# Syslog Dates: Month Day HH:MM:SS
SYSLOGTIMESTAMP %{MONTH} +%{MONTHDAY} %{TIME}
PROG [\x21-\x5a\x5c\x5e-\x7e]+
SYSLOGPROG %{PROG:program}(?:\[%{POSINT:pid}\])?
SYSLOGHOST %{IPORHOST}
SYSLOGFACILITY <%{NONNEGINT:facility}.%{NONNEGINT:priority}>
HTTPDATE %{MONTHDAY}/%{MONTH}/%{YEAR}:%{TIME} %{INT}

# Shortcuts
QS %{QUOTEDSTRING}

# Log formats
SYSLOGBASE %{SYSLOGTIMESTAMP:timestamp} (?:%{SYSLOGFACILITY} )?%{SYSLOGHOST:logsource} %{SYSLOGPROG}:

# Log Levels
LOGLEVEL ([Aa]lert|ALERT|[Tt]race|TRACE|[Dd]ebug|DEBUG|[Nn]otice|NOTICE|[Ii]nfo|INFO|[Ww]arn?(?:ing)?|WARN?(?:ING)?|[Ee]rr?(?:or)?|ERR?(?:OR)?|[Cc]rit?(?:ical)?|CRIT?(?:ICAL)?|[Ff]atal|FATAL|[Ss]evere|SEVERE|EMERG(?:ENCY)?|[Ee]merg(?:ency)?)    

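Original-content statement: this article is published in the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.

In case of infringement, contact cloudcommunity@tencent.com to have it removed.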
