15158846557 在线咨询 在线咨询
15158846557 在线咨询
所在位置: 首页 > 营销资讯 > 网站运营 > 使用Excel+VBA对网页进行操作

使用Excel+VBA对网页进行操作

时间:2023-07-05 00:18:01 | 来源:网站运营

时间:2023-07-05 00:18:01 来源:网站运营

使用Excel+VBA对网页进行操作:因为在本站的一些答案,最近总有私信问我如何使用VBA网抓的,我基本都没有回复。因为这个问题太大了,对于有基础的人来说,自己百度或者上ExcelHome论坛其实很容易找到答案,并不需要我说什么,而对于没有基础的人来说,三言两语不可能解决问题,我也不想把私信变成聊天窗。借着本站开放专栏的机会,正好来仔细交代一下这个问题。

对于Excel和VBA我所知有限,仅能解决自己遇到的一些问题,并不一定适用于所有场景。以下内容建立在了解基本VBA使用以及HTML语言知识的基础上:


一、前期准备

就我所知,VBA并不能操作任意浏览器及网页,我们所能做的仅仅是对IE进行一些操作,是的,仅仅是IE。不要告诉我电脑上没有IE,那样就可以Exit Sub了。就像Python用import、C#用using一样,VBA也需要引用一些库才能对IE进行操作,不过好在同属微软产品,所以我们能很简便的利用VBA自带的一些库。

首先我们要做的就是在VBA中引用Micorsoft Internet Controls,看这个名字就知道是帮助我们控制IE页面用的。

二、网页操作

引用Micorsoft Internet Controls之后,我们就可以对页面为所欲为了,不过首页我们要有个页面,上帝说要有页面!

1、打开网页

我们以在百度搜索“扯乎”关键词为例:

With CreateObject("internetexplorer.application") .Visible = True .Navigate "https://www.baidu.com/s?wd=扯乎"'关闭网页' .Quit End With代码很简单,先创建一个IE对象,然后给一些属性赋值。Visible是可见性,说的是在对网页进行操作时,这个网页是不是会被看见。熟练之后可以设置为False,不仅让程序在跑的时候有种神秘感(并没有),还能稍微加快一点速度。

不过有一点要记住,这个网页我们打开之后并没有关闭,也就是说程序结束后需要手动关闭,如果网页不可见是无法手动关闭的。代码中注释的部分就是关闭网页用的。Navigate不用多说就是URL。

我们必须要等网页完全加载完才能开始信息的抓取,这个时候使用到:(从这里开始,所有的代码都需要写在With代码块中


While .ReadyState <> 4 Or .Busy DoEvents WendBusy是网页忙碌状态,ReadyState是HTTP的5种就绪状态,对应如下:

  • 0:请求未初始化(还没有调用 open())。
  • 1:请求已经建立,但是还没有发送(还没有调用 send())。
  • 2:请求已发送,正在处理中(通常现在可以从响应中获取内容头)。
  • 3:请求在处理中;通常响应中已有部分数据可用了,但是服务器还没有完成响应的生成。
  • 4:响应已完成;您可以获取并使用服务器的响应了。
2、获取信息


我们先把页面中的所有内容抓下来,后期筛选出有用的部分再慢慢给抓取添加条件。

Set dmt = .Document For i = 0 To dmt.all.Length - 1 Set htMent = dmt.all(i) With ActiveSheet .Cells(i + 2, "A") = htMent.tagName .Cells(i + 2, "B") = TypeName(htMent) .Cells(i + 2, "C") = htMent.ID .Cells(i + 2, "D") = htMent.Name .Cells(i + 2, "E") = htMent.Value .Cells(i + 2, "F") = htMent.Text .Cells(i + 2, "G") = htMent.innerText End With Next i这块代码和JS有些相似,需要从IE.Document.all中把页面上所有节点找出来。这里也提供其他几种方法:

这些都是在抓取了全部页面内容后帮助筛选有效信息时使用起来比较方便的。当然all还是最好用的,因为all也存在all("IDName")以及all.IDName等等用法。

上面代码部分返回的属性值都是HTML基本内容,就不一一解释了。

3、填充信息

网抓神器当然还是Python,大部分人使用Excel的目的还是在于对页面内容进行自动填充,直接让表格提交网页,问卷录入之类的工作都省心不少。在抓取了页面内容之后,想填充更加是易如反掌的事情,只需要直接给页面标签的Value属性赋值就可以了。

不过网页中除了文本框,可能还存在一些其他没有Value的标签,比如:下拉菜单、单选框。给这些内容赋值就需要一些基本的HTML知识了。

'下拉菜单选择.all("select")(0).Selected = True'单选按钮选择.all("radio").Checked = True'复选按钮选择.all("checkbox").Checked = True下拉菜单是select标签,每个选项都在一个option标签里,所以返回一个集合,需要选中某个选项就要修改对应的Selected属性为True。单选和复选按钮都是input标签,区别在于类型分别是radio和checkbox,要选中某个选项需要修改对应的Checked属性。

三、数据接口

有时候我们能直接拿到一些API,通过API返回数据当然比打开网页更方便快捷,所使用的方法也有一些不太一样。

1、请求接口

比如我从网上得到一个能通过城市查询免费WIFI的API,通过Excel接口访问就使用下面的代码:(虽然是免费的,为了避免麻烦还是把我的AppKey隐去了)


Dim http Set http = CreateObject("Microsoft.XMLHTTP") http.Open "GET", "http://api.avatardata.cn/Wifi/QueryByCity", False http.setRequestHeader "CONTENT-TYPE", "application/x-www-form-urlencoded" http.send "key=[AppKey]&city=杭州&page=1"这时我们创建的对象就不再是IE,而是HTTP对象。这里用的是ajax的Open方法,GET是数据发送方式,第二个参数是接口地址,第三个参数是指定请求方式是否为异步。如果这个API有帐号密码,分别写在第四第五个参数。

setRequestHeader就是给接口发送一个HTTP协议头文件,最后send的内容是接口参数。当然,这个QueryString也可以直接写在URL里,send一个空字符串就可以了。


2、接口返回

接口返回获取的方式很简单:

If http.Status = 200 Then Range("A1").Value = http.responseText这里的HTTP状态又变成200了,和之前说好的不一样啊摔~有兴趣可以自己查查具体有哪些。

不过接口返回要么是JSON要么是XML,Excel处理起来十分不方便。这里提供一个处理JSON的方法,是从网上找来的类模块,具体内容放在附录里。在添加了这个clsJSON类模块后,对JSON的处理就变得十分简单了。

将上面的代码改成:

If http.Status = 200 Then Dim json$ json = http.responseText Dim objJSON As New clsJSON, dicJSON As Object Set dicJSON = objJSON.parse(json) For i = 1 To dicJSON("result")("data").Count Sheet1.Cells(i + 1, 1) = dicJSON("result")("data")(i)("name") Sheet1.Cells(i + 1, 2) = dicJSON("result")("data")(i)("intro") Sheet1.Cells(i + 1, 3) = dicJSON("result")("data")(i)("address") Next i End If接口返回的示例我也放在附录里了,根据接口返回的对象名、数组名去修改dicJSON后面的内容就可以了。这个处理JSON的模块用的是VBA中字典+集合的原理,所以数据处理后的调用方式也参照字典和集合。


以上是我用Excel+VBA进行网页操作的一些个人经验,希望能帮助到一些有需要的人。有什么错漏的地方,也希望本站大牛批评指正。


附录一:VBA处理JSON的类模块

Option Explicit'================================' VBA处理JSON文件的类模块'' http://www.cnhup.com'================================Const INVALID_JSON As Long = 1Const INVALID_OBJECT As Long = 2Const INVALID_ARRAY As Long = 3Const INVALID_BOOLEAN As Long = 4Const INVALID_NULL As Long = 5Const INVALID_KEY As Long = 6Private Sub Class_Initialize()End SubPrivate Sub Class_Terminate()End SubPublic Function parse(ByRef str As String) As Object Dim index As Long index = 1 On Error Resume Next Call skipChar(str, index) Select Case Mid(str, index, 1) Case "{" Set parse = parseObject(str, index) Case "[" Set parse = parseArray(str, index) End SelectEnd FunctionPrivate Function parseObject(ByRef str As String, ByRef index As Long) As Object Set parseObject = CreateObject("Scripting.Dictionary") ' "{" Call skipChar(str, index) If Mid(str, index, 1) <> "{" Then Err.Raise vbObjectError + INVALID_OBJECT, Description:="char " & index & " : " & Mid(str, index) index = index + 1 Do Call skipChar(str, index) If "}" = Mid(str, index, 1) Then index = index + 1 Exit Do ElseIf "," = Mid(str, index, 1) Then index = index + 1 Call skipChar(str, index) End If Dim key As String ' add key/value pair parseObject.Add key:=parseKey(str, index), Item:=parseValue(str, index) LoopEnd FunctionPrivate Function parseArray(ByRef str As String, ByRef index As Long) As Collection Set parseArray = New Collection ' "[" Call skipChar(str, index) If Mid(str, index, 1) <> "[" Then Err.Raise vbObjectError + INVALID_ARRAY, Description:="char " & index & " : " + Mid(str, index) index = index + 1 Do Call skipChar(str, index) If "]" = Mid(str, index, 1) Then index = index + 1 Exit Do ElseIf "," = Mid(str, index, 1) Then index = index + 1 Call skipChar(str, index) End If ' add value parseArray.Add parseValue(str, index) LoopEnd FunctionPrivate Function parseValue(ByRef str As String, ByRef index As Long) Call skipChar(str, index) Select Case Mid(str, index, 1) Case "{" Set parseValue = parseObject(str, index) Case "[" Set parseValue = parseArray(str, index) Case """", "'" parseValue = parseString(str, index) Case "t", "f" parseValue = parseBoolean(str, index) Case "n" parseValue = parseNull(str, index) Case Else parseValue = parseNumber(str, index) End SelectEnd FunctionPrivate Function parseString(ByRef str As String, ByRef index As Long) As String Dim quote As String Dim char As String Dim code As String Call skipChar(str, index) quote = Mid(str, index, 1) index = index + 1 Do While index > 0 And index <= Len(str) char = Mid(str, index, 1) Select Case (char) Case "/" index = index + 1 char = Mid(str, index, 1) Select Case (char) Case """", "//", "/" parseString = parseString & char index = index + 1 Case "b" parseString = parseString & vbBack index = index + 1 Case "f" parseString = parseString & vbFormFeed index = index + 1 Case "n" parseString = parseString & vbNewLine index = index + 1 Case "r" parseString = parseString & vbCr index = index + 1 Case "t" parseString = parseString & vbTab index = index + 1 Case "u" index = index + 1 code = Mid(str, index, 4) parseString = parseString & ChrW(Val("&h" + code)) index = index + 4 End Select Case quote index = index + 1 Exit Function Case Else parseString = parseString & char index = index + 1 End Select LoopEnd FunctionPrivate Function parseNumber(ByRef str As String, ByRef index As Long) Dim value As String Dim char As String Call skipChar(str, index) Do While index > 0 And index <= Len(str) char = Mid(str, index, 1) If InStr("+-0123456789.eE", char) Then value = value & char index = index + 1 Else If InStr(value, ".") Or InStr(value, "e") Or InStr(value, "E") Then parseNumber = CDbl(value) Else parseNumber = CInt(value) End If Exit Function End If LoopEnd FunctionPrivate Function parseBoolean(ByRef str As String, ByRef index As Long) As Boolean Call skipChar(str, index) If Mid(str, index, 4) = "true" Then parseBoolean = True index = index + 4 ElseIf Mid(str, index, 5) = "false" Then parseBoolean = False index = index + 5 Else Err.Raise vbObjectError + INVALID_BOOLEAN, Description:="char " & index & " : " & Mid(str, index) End IfEnd FunctionPrivate Function parseNull(ByRef str As String, ByRef index As Long) Call skipChar(str, index) If Mid(str, index, 4) = "null" Then parseNull = Null index = index + 4 Else Err.Raise vbObjectError + INVALID_NULL, Description:="char " & index & " : " & Mid(str, index) End IfEnd FunctionPrivate Function parseKey(ByRef str As String, ByRef index As Long) As String Dim dquote As Boolean Dim squote As Boolean Dim char As String Call skipChar(str, index) Do While index > 0 And index <= Len(str) char = Mid(str, index, 1) Select Case (char) Case """" dquote = Not dquote index = index + 1 If Not dquote Then Call skipChar(str, index) If Mid(str, index, 1) <> ":" Then Err.Raise vbObjectError + INVALID_KEY, Description:="char " & index & " : " & parseKey End If End If Case "'" squote = Not squote index = index + 1 If Not squote Then Call skipChar(str, index) If Mid(str, index, 1) <> ":" Then Err.Raise vbObjectError + INVALID_KEY, Description:="char " & index & " : " & parseKey End If End If Case ":" If Not dquote And Not squote Then index = index + 1 Exit Do End If Case Else If InStr(vbCrLf & vbCr & vbLf & vbTab & " ", char) Then Else parseKey = parseKey & char End If index = index + 1 End Select LoopEnd FunctionPublic Sub skipChar(ByRef str As String, ByRef index As Long) While index > 0 And index <= Len(str) And InStr(vbCrLf & vbCr & vbLf & vbTab & " ", Mid(str, index, 1)) index = index + 1 WendEnd SubPublic Function toString(ByRef obj As Variant) As String Select Case VarType(obj) Case vbNull toString = "null" Case vbDate toString = """" & CStr(obj) & """" Case vbString toString = """" & encode(obj) & """" Case vbObject Dim bFI, i bFI = True If TypeName(obj) = "Dictionary" Then toString = toString & "{" Dim keys keys = obj.keys For i = 0 To obj.Count - 1 If bFI Then bFI = False Else toString = toString & "," Dim key key = keys(i) toString = toString & """" & key & """:" & toString(obj(key)) Next i toString = toString & "}" ElseIf TypeName(obj) = "Collection" Then toString = toString & "[" Dim value For Each value In obj If bFI Then bFI = False Else toString = toString & "," toString = toString & toString(value) Next value toString = toString & "]" End If Case vbBoolean If obj Then toString = "true" Else toString = "false" Case vbVariant, vbArray, vbArray + vbVariant Dim sEB toString = multiArray(obj, 1, "", sEB) Case Else toString = Replace(obj, ",", ".") End SelectEnd FunctionPrivate Function encode(str) As String Dim i, j, aL1, aL2, c, p aL1 = Array(&H22, &H5C, &H2F, &H8, &HC, &HA, &HD, &H9) aL2 = Array(&H22, &H5C, &H2F, &H62, &H66, &H6E, &H72, &H74) For i = 1 To Len(str) p = True c = Mid(str, i, 1) For j = 0 To 7 If c = Chr(aL1(j)) Then encode = encode & "/" & Chr(aL2(j)) p = False Exit For End If Next If p Then Dim a a = AscW(c) If a > 31 And a < 127 Then encode = encode & c ElseIf a > -1 Or a < 65535 Then encode = encode & "/u" & String(4 - Len(Hex(a)), "0") & Hex(a) End If End If NextEnd FunctionPrivate Function multiArray(aBD, iBC, sPS, ByRef sPT) ' Array BoDy, Integer BaseCount, String PoSition Dim iDU, iDL, i ' Integer DimensionUBound, Integer DimensionLBound On Error Resume Next iDL = LBound(aBD, iBC) iDU = UBound(aBD, iBC) Dim sPB1, sPB2 ' String PointBuffer1, String PointBuffer2 If Err.Number = 9 Then sPB1 = sPT & sPS For i = 1 To Len(sPB1) If i <> 1 Then sPB2 = sPB2 & "," sPB2 = sPB2 & Mid(sPB1, i, 1) Next' multiArray = multiArray & toString(Eval("aBD(" & sPB2 & ")")) multiArray = multiArray & toString(aBD(sPB2)) Else sPT = sPT & sPS multiArray = multiArray & "[" For i = iDL To iDU multiArray = multiArray & multiArray(aBD, iBC + 1, i, sPT) If i < iDU Then multiArray = multiArray & "," Next multiArray = multiArray & "]" sPT = Left(sPT, iBC - 2) End If Err.ClearEnd Function

附录二:JSON返回示例

{"resultcode":"200","reason":"ReturnSuccessd!","result":{"data":[{"name":"杭州市法雨合","intro":"法雨合0层","address":"杭州市朝阳区朝阳区三里屯","google_lat":"39.9372423","google_lon":"116.4480615","baidu_lat":"39.942952987502","baidu_lon":"116.45464108129","province":"杭州市","city":"杭州市"},{"name":"杭州朝阳西坝河光熙门北里","intro":"朝阳西坝河光熙门北里34-8号0层","address":"杭州市朝阳区朝阳区西坝河光熙门北里34号-8号0层","google_lat":"39.9635121","google_lon":"116.435895","baidu_lat":"39.969407173324","baidu_lon":"116.44243487981","province":"杭州市","city":"杭州市"},{"name":"杭州朝阳三里屯北街","intro":"","address":"杭州市朝阳区朝阳三里屯北街8号0层","google_lat":"39.9254286","google_lon":"116.4605935","baidu_lat":"39.931073085771","baidu_lon":"116.46719483818","province":"杭州市","city":"杭州市"},{"name":"杭州大都酒吧街","intro":"","address":"杭州市朝阳区元大都酒吧街11号","google_lat":"39.975984","google_lon":"116.424389","baidu_lat":"39.982089966811","baidu_lon":"116.43086831752","province":"杭州市","city":"杭州市"},{"name":"杭州西城前海北沿","intro":"","address":"杭州市西城区西城前海北沿10号0层","google_lat":"39.9369032","google_lon":"116.3919335","baidu_lat":"39.943215619704","baidu_lon":"116.39830652238","province":"杭州市","city":"杭州市"},{"name":"杭州市西城后浙江沿36号对面","intro":"后浙江沿36号对面0层","address":"杭州市西城区后浙江沿36号","google_lat":"39.9396792","google_lon":"116.389129","baidu_lat":"39.945967638433","baidu_lon":"116.39551153315","province":"杭州市","city":"杭州市"},{"name":"杭州市赛百味","intro":"ok","address":"杭州市西城区中关村东路18号","google_lat":"39.9810991","google_lon":"116.3333866","baidu_lat":"39.9867766224","baidu_lon":"116.34001632032","province":"杭州市","city":"杭州市"},{"name":"杭州市光华路数码01","intro":"","address":"杭州市朝阳区光华路数码01大厦0层","google_lat":"39.9132392","google_lon":"116.4592309","baidu_lat":"39.918885961978","baidu_lon":"116.46583845234","province":"杭州市","city":"杭州市"},{"name":"杭州市盛铭帮逸园会馆","intro":"盛铭帮逸园会馆0","address":"杭州市朝阳区逸园25号","google_lat":"39.8710876","google_lon":"116.4602965","baidu_lat":"39.876744728506","baidu_lon":"116.46693498949","province":"杭州市","city":"杭州市"},{"name":"杭州市地平线酒吧","intro":"","address":"杭州市朝阳区朝阳三里屯北街70号","google_lat":"39.9254286","google_lon":"116.4605935","baidu_lat":"39.931073085771","baidu_lon":"116.46719483818","province":"杭州市","city":"杭州市"}],"pageinfo":{"pnums":20,"current":1 } }}

关键词:操作,使用

74
73
25
news

版权所有© 亿企邦 1997-2025 保留一切法律许可权利。

为了最佳展示效果,本站不支持IE9及以下版本的浏览器,建议您使用谷歌Chrome浏览器。 点击下载Chrome浏览器
关闭