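C#: Simulating Login to the Xiaohongshu Web Version with WebView2, Parsing Watermark-Free Videos and Images, and Handling X-s/X-t Signature Verification [April 15, 2023]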
I. Introduction to C# WebView2
1. I started out with WebBrowser. Because the WebBrowser control uses the IE engine, I edited the registry to switch it to the Edge engine,
only to find that this Edge engine version was too old: some video sites reported "browser version too low" and "video cannot be loaded".
2. WebBrowser engine version compared with WebView2
WebBrowser engine version:
Engine version: Edge 18.9200, compatible with WebKit 537.36, Chrome 70
UserAgent: Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.9200
Current Edge engine version:
Engine version: WebKit 537.36, Chrome 111.0.0.0
UserAgent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36 Edg/111.0.1661.54
WebView2 engine version:
Engine version: WebKit 537.36, Chrome 111.0.0.0
UserAgent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36 Edg/111.0.1661.51
As shown, the WebView2 engine version matches that of Edge and opens video sites without trouble, whereas the WebBrowser engine version is too old.
[Figure: WebView2 loading a video site]
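The same comparison can be made in code by pulling the Chrome major version out of a user-agent string. This is a minimal sketch and not part of the original project: `UserAgentSketch` and `ChromeMajorVersion` are hypothetical names, and the regular expression assumes the `Chrome/<major>.` token format seen in the user-agent strings above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the original project).
public static class UserAgentSketch
{
    // Returns the Chrome major version found in the user agent, or -1 if there is none.
    public static int ChromeMajorVersion(string userAgent)
    {
        Match m = Regex.Match(userAgent ?? "", @"Chrome/(\d+)\.");
        return m.Success ? int.Parse(m.Groups[1].Value) : -1;
    }
}
```

Applied to the WebBrowser user agent above this yields 70, and applied to the WebView2 user agent it yields 111, which is the gap that makes some video sites reject the older control.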
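3. WebView2 overview
The Microsoft Edge WebView2 control lets you embed web technologies (HTML, CSS, and JavaScript) in native apps. The WebView2 control uses Microsoft Edge as the rendering engine to display web content in native apps.
Although it is not cross-platform, it is a good replacement for the original browser control in Windows applications.
4. Installing WebView2
Open NuGet, search for WebView2 (the package is Microsoft.Web.WebView2), and install it. The WebView2 control then appears in the toolbox on the left and can be dragged directly onto a form.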
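After installation the control still has to finish initializing before `CoreWebView2` can be used. The following is a minimal sketch, assuming a WinForms project that references the Microsoft.Web.WebView2 package; `BrowserForm` and the `startUrl` parameter are hypothetical names, not taken from the original project, and the control is created in code here instead of being dragged from the toolbox.

```csharp
using System;
using System.Windows.Forms;
using Microsoft.Web.WebView2.WinForms;

// Hypothetical form hosting a WebView2 control (assumes the Microsoft.Web.WebView2 NuGet package).
public class BrowserForm : Form
{
    private readonly WebView2 webView = new WebView2 { Dock = DockStyle.Fill };

    public BrowserForm(string startUrl)
    {
        Controls.Add(webView);
        Load += async (sender, e) =>
        {
            // CoreWebView2 is null until initialization has completed.
            await webView.EnsureCoreWebView2Async();
            webView.CoreWebView2.Navigate(startUrl);
        };
    }
}
```

Such a form would be started with `Application.Run(new BrowserForm(url))` from an `[STAThread]` entry point.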
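II. Problem analysis
1. The login session
The Xiaohongshu site has to be opened in the web client. Once it is open, the browser cookies contain this field:
web_session=040069b3b3f6625dade26f8d1d364b44f72186
It records the login session. Requests need x-s and x-t in the headers and web_session in the cookie.
In testing, this web_session stays in the browser for some time; exactly how long remains to be verified (Bilibili has a similar session that lasts one month). The other fields do not matter.
Without WebView2 opening the site, you would have to request a web_session from the site yourself; here WebView2 has already taken care of that.
Reading cookie information through C# WebView2: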
private Dictionary<string, string> mCookies = new Dictionary<string, string>(); // cookies cached by name
/// <summary>
/// Reads the cookies from WebView2 asynchronously.
/// </summary>
/// <param name="url">URL whose cookies should be read</param>
private async Task getCookie(string url)
{
    List<CoreWebView2Cookie> cookieList = await webView.CoreWebView2.CookieManager.GetCookiesAsync(url);
    mCookies.Clear();
    foreach (CoreWebView2Cookie cookie in cookieList)
    {
        // The indexer avoids an exception when two cookies share a name.
        mCookies[cookie.Name] = cookie.Value;
    }
}
/// <summary>
/// Extracts a single field from the cookies.
/// </summary>
/// <param name="url">URL whose cookies should be read</param>
/// <param name="key">Cookie name, for example web_session</param>
/// <param name="t">Delay (unused)</param>
/// <returns>"key=value", or null if the cookie is missing</returns>
public async Task<string> getCookieEx(string url, string key, int t)
{
    // Await the read so that mCookies is filled before it is queried;
    // an async void getCookie would return before the cookies arrive.
    await getCookie(url);
    if (mCookies.ContainsKey(key))
    {
        return key + "=" + mCookies[key];
    }
    return null;
}
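The cookie lookup only works if the caller waits for the asynchronous read to finish. The sketch below isolates that pattern so it can run without a browser: `CookieSketch` is a hypothetical class, and the `fetchCookies` delegate is an assumed stand-in for `CookieManager.GetCookiesAsync`, not the WebView2 API itself.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper; fetchCookies stands in for the WebView2 cookie manager.
public static class CookieSketch
{
    // Returns "key=value" for the requested cookie, or null when it is missing.
    public static async Task<string> GetCookiePairAsync(
        Func<string, Task<IDictionary<string, string>>> fetchCookies,
        string url,
        string key)
    {
        // Awaiting here guarantees the cookies are available before the lookup.
        IDictionary<string, string> cookies = await fetchCookies(url);
        string value;
        return cookies.TryGetValue(key, out value) ? key + "=" + value : null;
    }
}
```

Because the method returns `Task<string>`, a caller can `await` it from a UI event handler and only build the request once the `web_session` pair is actually known.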
2. The note information API
The current note information endpoint: /api/sns/web/v1/feed
Requests need x-s and x-t in the headers and web_session in the cookie.
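A request carrying these three values could be assembled as follows. This is a sketch under stated assumptions: `FeedRequestSketch` and all of its parameters are hypothetical, the host is passed in by the caller because the text only gives the path, and POST with a JSON body is assumed from the `getPostResult` helper shown later. The signature values themselves must come from the site's sign function.

```csharp
using System;
using System.Net.Http;
using System.Text;

// Hypothetical helper that only builds the request; it does not send it.
public static class FeedRequestSketch
{
    public static HttpRequestMessage Build(Uri baseAddress, string jsonBody, string xs, string xt, string webSession)
    {
        var request = new HttpRequestMessage(HttpMethod.Post, new Uri(baseAddress, "/api/sns/web/v1/feed"));
        // Assumption: the endpoint accepts a JSON body.
        request.Content = new StringContent(jsonBody, Encoding.UTF8, "application/json");
        // Signature headers plus the login session cookie.
        request.Headers.TryAddWithoutValidation("x-s", xs);
        request.Headers.TryAddWithoutValidation("x-t", xt);
        request.Headers.TryAddWithoutValidation("Cookie", "web_session=" + webSession);
        return request;
    }
}
```

When the request is sent, the `HttpClient` should be created with `new HttpClientHandler { UseCookies = false }`; otherwise the handler's own cookie container takes precedence over a manually set Cookie header.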
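3. X-s
There are many ways to locate it; one is a global search for "X-s". Tracing upward shows that the code belongs to the sign method, function sign(e, t) {}
Copy all of it locally, then use the resulting errors to fill in the missing functions and environment, for example a0_0x4dee00, a0_0x5c27, and a0_0x543e,
and then add the usual navigator, location, document, and window objects.
Keep debugging against the specific errors as they appear. For example, in case "6" of the sign method change the code to var vr = window, and in case "7" manually change it to dr = ur['sNYMU']
This yields a working JavaScript version of function sign(e, t) {}; anyone who needs it can contact the author on WeChat: byc6352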
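III. C# and JavaScript interop in WebView2
1. Create a ScriptHost object and register it with WebView2:
/// <summary>
/// Methods that web pages can call in C#.
/// </summary>
[ClassInterface(ClassInterfaceType.AutoDual)]
[ComVisible(true)]
public class ScriptHost
2. Register the ScriptHost object in the WebView2 initialization-completed event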
/// <summary>
/// CoreWebView2 initialization completed.
/// </summary>
/// <param name="sender"></param>
/// <param name="e"></param>
private void webView_CoreWebView2InitializationCompleted(object sender, CoreWebView2InitializationCompletedEventArgs e)
{
    // Register the host object used for script-to-C# interop (exposed to pages as winning and winasync)
    webView.CoreWebView2.AddHostObjectToScript("scriptHost", scriptHost);
    // Register global variables: winasync for asynchronous calls, winning for synchronous calls
    webView.CoreWebView2.AddScriptToExecuteOnDocumentCreatedAsync("var winasync= window.chrome.webview.hostObjects.scriptHost;");
    webView.CoreWebView2.AddScriptToExecuteOnDocumentCreatedAsync("var winning= window.chrome.webview.hostObjects.sync.scriptHost;");
}
3. Every public method exposed by ScriptHost can be called from front-end JavaScript.
/// <summary>
/// Logging (called from front-end JavaScript).
/// </summary>
/// <param name="message">Message from the JavaScript front end</param>
public void log(string message)
{
    Log.i(message); // written to a text file
    //MessageBox.Show(message);
}
..............................................
winning.log(data); // JavaScript side (synchronous call)
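Since the listing above shows the host class only in fragments, here is a small complete class of the same shape. It is an illustrative sketch, not the author's ScriptHost: the name `ScriptHostSketch`, the in-memory list, and the `count` method are all assumptions added so the example can run on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Hypothetical host object; the COM attributes match the ones used in the article.
[ClassInterface(ClassInterfaceType.AutoDual)]
[ComVisible(true)]
public class ScriptHostSketch
{
    private readonly List<string> messages = new List<string>();

    // Callable from JavaScript through the registered host object.
    public void log(string message)
    {
        messages.Add(DateTime.Now.ToString("o") + " " + message);
    }

    // Returns how many messages have been logged so far.
    public int count()
    {
        return messages.Count;
    }
}
```

Registered under the name scriptHost as in step 2, a page could call `winning.log("hello")` through the synchronous proxy, or `await winasync.count()` through the asynchronous proxy, whose method calls return promises.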
IV. Downloading the cover and video
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Configuration;
using System.Text;
using System.Threading.Tasks;
namespace XhsVideo
{
    /// <summary>
    /// Downloads a video and its cover.
    /// </summary>
    internal class VideoDown
    {
        public int id { get; set; }
        public VideoInfo video { get; set; }
        public string savedir { get; set; }
        public string msg { get; set; }
        public bool success { get; set; }
        public string headers { get; set; }
        public VideoDown(int id, VideoInfo video, string savdir, string headers = null)
        {
            this.id = id;
            this.video = video;
            this.savedir = savdir;
            // Create the target directory if it does not exist yet.
            if (!System.IO.Directory.Exists(savdir)) System.IO.Directory.CreateDirectory(savdir);
            this.headers = headers;
        }
        public void process()
        {
            string filename = MakeValidFileName(video.title, ""); // strip characters that are illegal in file names
            string videoname = System.IO.Path.Combine(savedir, filename + ".mp4");
            string covername = System.IO.Path.Combine(savedir, filename + ".webp");
            if (System.IO.File.Exists(covername)) System.IO.File.Delete(covername);
            if (System.IO.File.Exists(videoname)) System.IO.File.Delete(videoname);
            Log.i(videoname);
            Log.i(covername);
            //NetHelper.downloadfileAsync(video.videoUrl, videoname);
            NetHelper.downloadfileAsync(video.coverUrl, covername);
            NetHelper.DownloadFileAsync(video.videoUrl, videoname, showProgress); // reports the video download progress
        }
        /// <summary>
        /// Replaces every character that is illegal in a file name.
        /// </summary>
        /// <param name="text">Original string</param>
        /// <param name="replacement">String substituted for each illegal character</param>
        public static string MakeValidFileName(string text, string replacement = "_")
        {
            StringBuilder str = new StringBuilder();
            var invalidFileNameChars = System.IO.Path.GetInvalidFileNameChars();
            foreach (var c in text)
            {
                if (invalidFileNameChars.Contains(c))
                {
                    str.Append(replacement ?? "");
                }
                else
                {
                    str.Append(c);
                }
            }
            return str.ToString();
        }
        /// <summary>
        /// Progress callback for the video download.
        /// </summary>
        /// <param name="msg">Download progress message</param>
        public void showProgress(string msg)
        {
            this.msg = msg;
            Log.i(msg);
            // Send the message to the form's synchronization context so it is displayed on the UI thread.
            fMainForm.GetFMainForm().syncContext.Send(fMainForm.GetFMainForm().SetTextSafePost, this);
        }
    }
    /// <summary>
    /// Video information: title, cover URL, and video URL.
    /// </summary>
    public class VideoInfo
    {
        public string title { get; set; }
        public string coverUrl { get; set; }
        public string videoUrl { get; set; }
        public VideoInfo(string title, string coverUrl, string videoUrl)
        {
            this.title = title;
            this.coverUrl = coverUrl;
            this.videoUrl = videoUrl;
        }
    }
}
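Building the target path is the step most likely to go wrong here, because note titles can contain path separators and the save directory may not exist. A minimal sketch of that step on its own; `FileNameSketch`, `Sanitize`, and `BuildPath` are hypothetical names, and the set of illegal characters comes from `Path.GetInvalidFileNameChars`, which differs between operating systems.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper (not from the original project).
public static class FileNameSketch
{
    // Replaces each character that the current OS forbids in file names.
    public static string Sanitize(string text, string replacement = "_")
    {
        char[] invalid = Path.GetInvalidFileNameChars();
        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            if (Array.IndexOf(invalid, c) >= 0) sb.Append(replacement);
            else sb.Append(c);
        }
        return sb.ToString();
    }

    // Ensures the directory exists and returns dir + sanitized title + extension.
    public static string BuildPath(string dir, string title, string extension)
    {
        Directory.CreateDirectory(dir); // does nothing if the directory already exists
        return Path.Combine(dir, Sanitize(title) + extension);
    }
}
```

`Path.Combine` inserts the correct separator for the platform, which avoids hand-written backslashes in string literals.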
V. Logging
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace XhsVideo
{
    /// <summary>
    /// Writes log entries to a text file.
    /// </summary>
    internal class Log
    {
        private static Log log;
        private static string logName;
        private Log(string filename)
        {
            logName = filename;
        }
        public static Log GetLog(string filename)
        {
            if (log == null) { log = new Log(filename); }
            return log;
        }
        public static Log GetLog()
        {
            return log;
        }
        public static void i(string msg)
        {
            // Each entry is two lines: the timestamp, then the message.
            string[] text = { DateTime.Now.ToString(), msg };
            try
            {
                // Append to the log file; the using block closes the writer.
                using (StreamWriter sw = new StreamWriter(logName, true, Encoding.UTF8))
                {
                    foreach (string s in text)
                    {
                        sw.WriteLine(s);
                    }
                }
            }
            catch (Exception)
            {
                // Logging failures are ignored so that they never break the caller.
            }
        }
    }
}
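Download callbacks can arrive from several threads, and two writers opening the same file at once will make one of them fail. The sketch below shows one way to serialize the writes with a lock. It is an illustration under assumptions, not the author's logger: `FileLogSketch`, `LogPath`, and the default file name are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical logger that serializes appends to a single file.
public static class FileLogSketch
{
    private static readonly object Gate = new object();

    // Assumed default location; callers may point this at any writable file.
    public static string LogPath = Path.Combine(Path.GetTempPath(), "xhs-sketch.log");

    public static void Info(string msg)
    {
        string line = DateTime.Now.ToString("yyyy-MM-dd HH:mm:ss") + " " + msg + Environment.NewLine;
        // Only one thread appends at a time, so entries never interleave.
        lock (Gate)
        {
            File.AppendAllText(LogPath, line, Encoding.UTF8);
        }
    }
}
```

Putting the timestamp and the message on one line also makes the file easier to filter with text tools than the two-line entries used above.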
VI. Network access component
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http.Headers;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using System.Net;
using System.IO;
using System.Windows.Forms;
namespace XhsVideo
{
    /// <summary>
    /// Network access component.
    /// </summary>
    internal class NetHelper
    {
        #region File download
        /// <summary>
        /// Downloads a file.
        /// </summary>
        /// <param name="url">Download URL</param>
        /// <param name="savePath">Local save path including the file name</param>
        /// <param name="downloadCallBack">Progress callback (total size, bytes downloaded, percentage)</param>
        /// <returns></returns>
        public static async Task DownloadFileAsync(string url, string savePath, Action<string> downloadCallBack = null)
        {
            try
            {
                downloadCallBack?.Invoke($"File [{url}]: download started.");
                using (HttpClient client = new HttpClient())
                // ResponseHeadersRead returns once the headers arrive, so the body is streamed
                // in the loop below and the progress values reflect the actual transfer.
                using (HttpResponseMessage response = await client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead))
                {
                    response.EnsureSuccessStatusCode();
                    long total = response.Content.Headers.ContentLength ?? 0;
                    using (var stream = await response.Content.ReadAsStreamAsync())
                    using (var fileStream = new FileInfo(savePath).Create())
                    {
                        if (downloadCallBack == null)
                        {
                            await stream.CopyToAsync(fileStream);
                            return;
                        }
                        byte[] buffer = new byte[8192];
                        long readLength = 0;
                        int length;
                        int lastPercent = -1;
                        while ((length = await stream.ReadAsync(buffer, 0, buffer.Length)) != 0)
                        {
                            // Write the chunk to the file.
                            await fileStream.WriteAsync(buffer, 0, length);
                            readLength += length;
                            // Report once per whole percent; skipped when the server sends no Content-Length.
                            int percent = total > 0 ? (int)(readLength * 100 / total) : -1;
                            if (percent != lastPercent)
                            {
                                lastPercent = percent;
                                downloadCallBack($"Total: [{total}], downloaded: [{readLength}], progress: [{percent}]");
                            }
                        }
                        downloadCallBack($"Total: [{total}], downloaded: [{readLength}]. Download complete.");
                    }
                }
            }
            catch (Exception ex)
            {
                downloadCallBack?.Invoke($"File download failed: {ex.Message}");
            }
        }
        #endregion
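        /// <summary>
        /// Downloads a file asynchronously.
        /// </summary>
        /// <param name="url">Download URL</param>
        /// <param name="filename">Local file name</param>
        public static async Task downloadfileAsync(string url, string filename)
        {
            // Returning Task instead of void lets callers await the download and observe exceptions.
            using (var web = new WebClient())
            {
                await web.DownloadFileTaskAsync(url, filename);
            }
        }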
        /// <summary>
        /// Resolves a redirect.
        /// </summary>
        /// <param name="url">Original URL</param>
        /// <param name="domain">Domain (unused)</param>
        /// <param name="ua">User agent (unused)</param>
        /// <returns>The redirect target, or an empty string if there is none</returns>
        public static string getRedirectedUrl(string url, string domain, string ua)
        {
            var str = getRedirectedUrl_T(url);
            return str.Result;
        }
        /// <summary>
        /// Resolves a redirect asynchronously.
        /// </summary>
        /// <param name="url"></param>
        /// <returns></returns>
        private static async Task<string> getRedirectedUrl_T(string url)
        {
            try
            {
                // Automatic redirects are disabled so that the Location header can be read directly.
                var handler = new HttpClientHandler()
                {
                    AllowAutoRedirect = false
                };
                var client = new HttpClient(handler);
                var response = await client.GetAsync(url).ConfigureAwait(continueOnCapturedContext: false);
                int statuscode = (int)response.StatusCode;
                if (statuscode == 302 || statuscode == 307)
                {
                    // Location can be absent on a malformed response.
                    return response.Headers.Location?.ToString() ?? "";
                }
                return "";
            }
            catch (Exception e)
            {
                System.Diagnostics.Debug.WriteLine("getRedirectedUrl_T:" + e.ToString());
                return "";
            }
        }
        //--------------------------------------------------------------- POST ---------------------------------------------------------------
        /// <summary>
        /// Sends a POST request.
        /// </summary>
        /// <param name="url">Request URL</param>
        /// <param name="args">Request body</param>
        /// <param name="headers">HTTP headers, one per line</param>
        /// <returns>The JSON returned by the server</returns>
        public static string getPostResult(string url, string args, string headers)
        {
            var str = getPostResult_T(url, args, headers);
            return str.Result;
        }
        /// <summary>
        /// Sends a POST request asynchronously.
        /// </summary>
        /// <param name="url"></param>
        /// <param name="args"></param>
        /// <param name="headers"></param>
        /// <returns></returns>
        private static async Task<string> getPostResult_T(string url, string args, string headers)
        {
            try
            {
                // UseCookies = false so that a Cookie header supplied in headers is sent as-is.
                var handler = new HttpClientHandler() { UseCookies = false };
                var client = new HttpClient(handler);
                client.Timeout = TimeSpan.FromSeconds(20);
                var message = new HttpRequestMessage(HttpMethod.Post, url);
                message.Content = new StringContent(args);
                message.Content.Headers.ContentType = new MediaTypeHeaderValue("application/json");
                Dictionary<string, string> dictionary = ParseToDictionary(headers);
                foreach (var pair in dictionary)
                {
                    // Content-Type belongs to the content headers and is already set above.
                    if (pair.Key.Trim().Equals("content-type", StringComparison.OrdinalIgnoreCase)) continue;
                    message.Headers.Add(pair.Key, pair.Value);
                }
                var result = await client.SendAsync(message).ConfigureAwait(continueOnCapturedContext: false);
                result.EnsureSuccessStatusCode();
                return await result.Content.ReadAsStringAsync();
            }
            catch (Exception e)
            {
                System.Diagnostics.Debug.WriteLine("getPostResult_T:" + e.ToString());
                return "";
            }
        }
        /// <summary>
        /// Parses a string into a dictionary.
        /// </summary>
        /// <param name="str">Lines of the form "name:value", separated by CRLF</param>
        /// <returns>The parsed dictionary</returns>
        private static Dictionary<string, string> ParseToDictionary(string str)
        {
            Dictionary<string, string> result = new Dictionary<string, string>();
            string str1 = str;
            int i = str1.IndexOf("\r\n");
            int j = 0;
            while (i > 0)
            {
                // Split the current line at the first colon.
                string str2 = str1.Substring(0, i);
                j = str2.IndexOf(":");
                if (j > 0)
                {
                    string str21 = str2.Substring(0, j);
                    string str22 = str2.Substring(j + 1);
                    result.Add(str21, str22);
                }
                // Skip past the two-character CRLF separator.
                str1 = str1.Substring(i + 2);
                i = str1.IndexOf("\r\n");
            }
            // Handle the last line, which has no trailing CRLF.
            j = str1.IndexOf(":");
            if (j > 0)
            {
                string str21 = str1.Substring(0, j);
                string str22 = str1.Substring(j + 1);
                result.Add(str21, str22);
            }
            return result;
        }
        //------------------------------------------------------------------- GET -------------------------------------------------------------------
        /// <summary>
        /// Gets the HTML returned by the server.
        /// </summary>
        /// <param name="url"></param>
        /// <returns></returns>
        public static string getHtmlCode(string url)
        {
            var str = getHtmlCode_T(url);
            return str.Result;
        }
        /// <summary>
        /// Gets the HTML returned by the server asynchronously.
        /// </summary>
        /// <param name="url"></param>
        /// <returns></returns>
        public static async Task<string> getHtmlCode_T(string url)
        {
            try
            {
                var http = new HttpClient();
                http.Timeout = TimeSpan.FromSeconds(10);
                var result = await http.GetStringAsync(url).ConfigureAwait(continueOnCapturedContext: false);
                return result;
            }
            catch (Exception e)
            {
                System.Diagnostics.Debug.WriteLine("getHtmlCode_T error: " + e.ToString());
                Log.i(e.ToString());
                return "";
            }
        }
    }
}
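The progress reporting in the download helper boils down to copying a stream in chunks and emitting a message whenever the whole-number percentage changes. A minimal sketch of just that loop, assuming the total size is known in advance; `ProgressCopySketch`, `CopyAsync`, and the `onPercent` callback are hypothetical names, and the streams can be any readable and writable `Stream`.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper isolating the chunked copy with percentage callbacks.
public static class ProgressCopySketch
{
    // Copies source to destination and returns the number of bytes copied.
    public static async Task<long> CopyAsync(Stream source, Stream destination, long total, Action<int> onPercent)
    {
        var buffer = new byte[81920];
        long copied = 0;
        int lastPercent = -1;
        int read;
        while ((read = await source.ReadAsync(buffer, 0, buffer.Length).ConfigureAwait(false)) > 0)
        {
            await destination.WriteAsync(buffer, 0, read).ConfigureAwait(false);
            copied += read;
            // Without a known total there is nothing meaningful to report.
            if (total > 0 && onPercent != null)
            {
                int percent = (int)(copied * 100 / total);
                if (percent != lastPercent)
                {
                    lastPercent = percent;
                    onPercent(percent);
                }
            }
        }
        return copied;
    }
}
```

For a real download the source stream would come from a response requested with `HttpCompletionOption.ResponseHeadersRead`, so that the body has not been buffered before the loop starts. Integer percentages also avoid comparing floating-point remainders to decide when to report.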
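There are too many technical points to cover in one sitting, so this post will be expanded later. Contact the author if you need the source code.
(To be continued)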