Aggregator
Can't Keep a Straight Face! Getting Fleeced in Istanbul
SigmaHQ Rules Release Highlights — r2024–04–29
How space exploration benefits life on Earth: An interview with David Eicher
How we fought bad apps and bad actors in 2023
From Assistant to Analyst: The Power of Gemini 1.5 Pro for Malware Analysis
- A growing amount of malware has naturally increased workloads for defenders and particularly malware analysts, creating a need for improved automation and approaches to dealing with this classic threat.
- With the recent rise in generative AI tools, we decided to put our own Gemini 1.5 Pro to the test to see how it performed at analyzing malware. By providing code and using a simple prompt, we asked Gemini 1.5 Pro to determine if the file was malicious, and also to provide a list of activities and indicators of compromise.
- We did this for multiple malware files, testing with both decompiled and disassembled code, and Gemini 1.5 Pro was notably accurate each time, generating summary reports in human-readable language. Gemini 1.5 Pro was even able to make an accurate determination of code that — at the time — was receiving zero detections on VirusTotal.
- In our testing with other similar gen AI tools, we were required to divide the code into chunks, which led to vague and non-specific outcomes, and affected the overall analysis. Gemini 1.5 Pro, however, processed the entire code in a single pass, and often in about 30 to 40 seconds.
The explosive growth of malware continues to challenge traditional, manual analysis methods, underscoring the urgent need for improved automation and innovative approaches. Generative AI models have become invaluable in some aspects of malware analysis, yet their effectiveness in handling large and complex malware samples has been limited. The introduction of Gemini 1.5 Pro, capable of processing up to 1 million tokens, marks a significant breakthrough. This advancement not only empowers AI to function as a powerful assistant in automating the malware analysis workflow but also significantly scales up the automation of code analysis. By substantially increasing the processing capacity, Gemini 1.5 Pro paves the way for a more adaptive and robust approach to cybersecurity, helping analysts manage the asymmetric volume of threats more effectively and efficiently.
Traditional Techniques for Automated Malware Analysis
The foundation of automated malware analysis is built on a combination of static and dynamic analysis techniques, both of which play crucial roles in dissecting and understanding malware behavior. Static analysis involves examining the malware without executing it, providing insights into its code structure and unobfuscated logic. Dynamic analysis, on the other hand, involves observing the execution of the malware in a controlled environment to monitor its behavior, regardless of obfuscation. Together, these techniques are leveraged to gain a comprehensive understanding of malware.
Parallel to these techniques, AI and machine learning (ML) have increasingly been employed to classify and cluster malware based on behavioral patterns, signatures, and anomalies. These methodologies have ranged from supervised learning, where models are trained on labeled datasets, to unsupervised learning for clustering, which identifies patterns without predefined labels to group similar malware.
Despite technological advancements, the increasing complexity and volume of malware present substantial challenges. While ML enhances the detection of malware variants, it remains inadequate against completely new threats. This detection gap allows advanced attacks to slip through cybersecurity defenses, compromising system protection.
Generative AI as Malware Analysis Assistant
Code Insight, unveiled at the RSA Conference 2023, marked a significant step forward in leveraging generative AI (gen AI) for malware analysis. This novel feature of Google's VirusTotal platform specializes in analyzing code snippets and generating reports in natural language, effectively emulating the approach of a malware analyst. Initially supporting PowerShell scripts, Code Insight later expanded to other scripting languages and file formats, including Batch, Shell, VBScript, and Office documents.
By processing the code and generating summary reports, Code Insight assists analysts in understanding the behavior of the code and identifying attack techniques. This includes uncovering hidden functionalities, malicious intent, and potential attack vectors that might be missed by traditional detection methods.
However, due to the inherent constraints of large language models (LLMs) and their limited token input capacity, the size of files that Code Insight could handle was restricted. Although there have been continuous improvements to increase the maximum file size limit and support more formats, analyzing binaries and executables still poses a significant challenge. When these files are disassembled or decompiled, their code size typically surpasses the processing capabilities of the LLMs available at the time. Consequently, gen AI models have functioned primarily as assistants to human analysts, enabling the analysis of specific code fragments from binaries rather than processing the entire code, which is often too voluminous for these models.
Reverse Engineering: The Human Face of Malware Analysis
Reverse engineering is arguably the most advanced malware analysis technique available to cybersecurity professionals. This process involves disassembling the binaries of malicious software and carrying out a meticulous examination of the code. Through reverse engineering, analysts can uncover the exact functionality of malware and understand its execution flow. However, this method is not without its challenges. It requires an immense amount of time, a deep level of expertise, and an analytical mindset to interpret each instruction, data structure, and function call to reconstruct the malware's logic and uncover its secrets.
Furthermore, scaling reverse engineering efforts poses a significant challenge. The scarcity of specialized talent in this field exacerbates the difficulty of conducting these analyses at scale. Given the intricate and time-consuming nature of reverse engineering, the cybersecurity community has long sought ways to augment this process, making it more efficient and accessible.
Gemini 1.5 Pro: Scalable Reverse Engineering for Malware Analysis
The ability to process prompts of up to 1 million tokens enables a qualitative leap in malware analysis, particularly in the realm of reverse engineering. This advancement finally brings the power of gen AI to the analysis of binaries and executables, a task previously reserved for highly skilled human analysts due to its complexity.
How does Gemini 1.5 Pro achieve this?
- Increased capacity: With its expanded token limit, Gemini 1.5 Pro can entirely analyze some disassembled or decompiled executables in a single pass, eliminating the need to break down code into smaller fragments. This is crucial because fragmenting code can lead to a loss of context and important correlations between different parts of the program. When analyzing only small snippets, it is difficult to understand the overall functionality and behavior of the malware, potentially missing key insights into its purpose and operation. By analyzing the entire code at once, Gemini 1.5 Pro gains a holistic understanding of the malware, allowing for more accurate and comprehensive analysis.
- Code interpretation: Gemini 1.5 Pro can interpret the intent and purpose of the code, not just identify patterns or similarities. This is possible due to its training on a massive dataset of code, encompassing assembly language from various architectures, high-level languages like C, and pseudo-code produced by decompilers. This extensive knowledge base, combined with its understanding of operating systems, networking, and cybersecurity principles, allows Gemini 1.5 Pro to effectively emulate the reasoning and judgment of a malware analyst. As a result, it can predict the malware's actions and provide valuable insights even for never-seen-before threats. For more information on this, see the zero day case study section later in this post.
- Detailed analysis: Gemini 1.5 Pro can generate summary reports in human-readable language, making the analysis process more accessible and efficient. This goes far beyond the simple verdicts typically provided by traditional machine learning algorithms for classification and clustering. Gemini 1.5 Pro's reports can include detailed information about the malware's functionality, behavior, and potential attack vectors, as well as indicators of compromise (IOCs) that can be used to feed other security systems and improve threat detection and prevention capabilities.
Let's explore a practical case study to examine how Gemini 1.5 Pro performs in analyzing decompiled code with a representative malware sample. We processed two WannaCry binaries automatically using the Hex-Rays decompiler, without adding any annotations or additional context. This approach resulted in two C code files, one 268 KB and the other 231 KB in size, which together amount to more than 280,000 tokens for processing by the LLM.
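The file sizes and token count quoted above give a rough feel for how much decompiled code fits in a single prompt. The following is a minimal sketch of that arithmetic, not the tooling used in this post: the helper names are hypothetical, "KB" is assumed to mean 1024 bytes, and because the post reports "more than 280,000 tokens", the derived bytes-per-token ratio is only an approximate upper bound that will vary with the tokenizer and the code being analyzed.

```python
KB = 1024  # assumption: "KB" in the post means 1024 bytes


def bytes_per_token(total_bytes: int, total_tokens: int) -> float:
    """Average bytes of source text per token, derived from a known sample."""
    return total_bytes / total_tokens


def estimate_tokens(size_bytes: int, ratio: float) -> int:
    """Rough token estimate for a file, given a bytes-per-token ratio."""
    return round(size_bytes / ratio)


def fits_in_window(size_bytes: int, ratio: float, window: int = 1_000_000) -> bool:
    """True if the estimated token count fits in the context window."""
    return estimate_tokens(size_bytes, ratio) <= window


# Figures quoted in the post: two decompiled C files, 268 KB and 231 KB,
# reported as "more than 280,000 tokens" in total.
wannacry_bytes = (268 + 231) * KB
ratio = bytes_per_token(wannacry_bytes, 280_000)  # about 1.8 bytes per token

print(f"bytes per token: {ratio:.2f}")
print("fits in a 1M-token window:", fits_in_window(wannacry_bytes, ratio))
```

An estimate like this is only useful for a quick pre-check before submitting a file; an exact count would have to come from the model's own tokenizer.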
In our testing with other similar gen AI tools, we faced the necessity of dividing the code into chunks. This fragmentation often compromised the comprehensiveness of the analysis, resulting in vague and non-specific outcomes. These limitations highlight the challenges of using such tools with complex code bases.
Gemini 1.5 Pro, however, marks a significant departure from these constraints. It processes the entire decompiled code in a single pass, taking just 34 seconds to deliver its analysis. The initial summary provided by Gemini 1.5 Pro is notably accurate, showcasing its ability to handle large and complex datasets seamlessly and effectively:
- Issues a malicious verdict associated with ransomware
- Identifies some files as IOCs (c.wnry and tasksche.exe)
- Acknowledges the use of an algorithm to generate IP addresses and perform network scans to find targets on port 445/SMB to spread to other computers
- Identifies URL/domain (WannaCry's "killswitch") and relevant registry key and mutex
While it might seem that Gemini 1.5 Pro's report of WannaCry is based on pre-trained knowledge of this specific malware, this isn't the case. The analysis comes from the model's ability to independently interpret the code. This will become even clearer as we look at the upcoming examples where Gemini 1.5 Pro analyzes unfamiliar malware samples, demonstrating its wide-ranging capabilities.
LLM on Code: Disassembled vs. Decompiled
In the previous example showcasing WannaCry analysis, there was a crucial step before feeding the code to the LLM: decompilation. This process, which transforms binary code into a higher-level representation like C, is fully automated and mirrors the initial steps taken by malware analysts when manually dissecting malicious software. But what is the difference between disassembled and decompiled code, and how does it impact LLM analysis?
- Disassembly: This process converts binary code into assembly language, a low-level representation specific to the processor architecture. While human-readable, assembly code is still quite complex and requires significant expertise to understand. It is also much longer and more repetitive than the original source code.
- Decompilation: This process attempts to reconstruct the original source code from the binary. While not always perfect, decompilation can significantly improve readability and conciseness compared to disassembled code. It achieves this by identifying high-level constructs like functions, loops, and variables, making the code easier to understand for analysts.
Given these factors, when using LLMs for binary analysis, decompilation offers several advantages in efficiency and scalability. The shorter and more structured output from decompilation fits more readily within the processing constraints of LLMs, allowing for a more efficient analysis of large or complex binaries. In fact, the output from a decompiler is five to 10 times more concise than that produced by a disassembler.
Disassembly is necessary to perform accurate decompilation and remains an invaluable tool in certain scenarios where detailed, low-level analysis is crucial. Given the structured and higher-level nature of decompiled output, there are specific circumstances where disassembly provides insights that decompilation cannot match.
Fortunately, Gemini 1.5 Pro demonstrates equal capability in processing both high-level languages and assembly across various architectures. Thus, our implementation for automating binary analysis can utilize both strategies or adopt a hybrid approach, as suited to the specific circumstances of each case. This flexibility allows us to tailor our analysis method to the nature of the binary in question, optimizing for efficiency, depth of insight, and the specific objectives of the analysis, whether that means dissecting the logic and flow of the program or diving into the intricate details of its low-level operations.
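The choice between decompiled input, disassembled input, or both can be expressed as a small decision rule over estimated token counts. The sketch below is purely illustrative and is not the implementation described in this post: the function name, the `need_low_level` flag, and the rule itself are assumptions, with the 1-million-token budget taken from the text.

```python
from typing import Optional


def choose_representation(
    decompiled_tokens: int,
    disassembled_tokens: int,
    budget: int = 1_000_000,
    need_low_level: bool = False,
) -> Optional[str]:
    """Pick which code representation to send to the model in one pass.

    Returns "decompiled", "disassembled", "hybrid" (both together),
    or None when nothing fits within the token budget.
    """
    if need_low_level:
        # Low-level detail is wanted: send both if they fit together,
        # otherwise fall back to the assembly listing alone.
        if decompiled_tokens + disassembled_tokens <= budget:
            return "hybrid"
        if disassembled_tokens <= budget:
            return "disassembled"
    # Default: prefer the shorter, more structured decompiler output.
    if decompiled_tokens <= budget:
        return "decompiled"
    if disassembled_tokens <= budget:
        return "disassembled"
    return None


# Hypothetical token counts, for illustration only.
print(choose_representation(200_000, 1_200_000))
print(choose_representation(100_000, 700_000, need_low_level=True))
```

A real pipeline would likely weigh more than size, for example how well the decompiler handled the binary, but the budget check is the part that follows directly from the context-window limit.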
Next, we'll examine a case where we directly employ disassembly for analysis. This time, we're working with a more recent and unknown binary; in fact, the executable submitted to VirusTotal is flagged as malicious by only four out of the 70 VirusTotal anti-malware engines, and only in a generic sense, without providing any details about the malware family that could offer further clues about its behavior.
After automatic preprocessing with HexRays/IDA Pro, the 306.50 KB executable binary produces a 1.5 MB assembly file that Gemini 1.5 Pro can process in a single pass within 46 seconds, thanks to its large token window in the prompt. This capability allows for an analysis of the entire assembly output, offering detailed insights into the binary's operations.
This case of the unknown binary showcases the remarkable capabilities of Gemini 1.5 Pro. Despite only four out of 70 anti-malware engines on VirusTotal flagging the file as malicious—using only generic signatures—Gemini 1.5 Pro identified the file as malicious, providing a detailed explanation for its verdict. The file is likely a game cheat designed to inject a game hack dynamic-link library (DLL) into the Grand Theft Auto video game process. The designation of "malicious" may depend on perspective: deemed malicious by the game's developers or their security team focused on anti-cheating measures, yet potentially desirable for some players. Nevertheless, this automated first-pass analysis is not only impressive but also illuminating regarding the nature and intent of the binary.
Unveiling the Unknown: A Case Study in Zero-Day Detection
The true test of any malware analysis tool lies in its ability to identify never-before-seen threats that go undetected by traditional methods and to proactively protect systems from zero-day attacks. Here, we examine a case where an executable file is undetected by any anti-virus or sandbox on VirusTotal.
The 833 KB file, medui.exe, was decompiled into 189,080 tokens and subsequently processed by Gemini 1.5 Pro in a mere 27 seconds to produce a complete malware analysis report in a single pass.
This analysis revealed suspicious functionalities, leading Gemini 1.5 Pro to issue a malicious verdict. Based on its observations, it concluded that the primary goal of this malware is to steal cryptocurrency by hijacking Bitcoin transactions and evading detection through the disabling of security software.
This showcases Gemini's ability to go beyond simple pattern matching or ML classification and leverage its deep understanding of code behavior to identify malicious intent, even in previously unseen threats. This is a significant advancement in the field of malware analysis, as it allows us to proactively detect and respond to new and emerging threats that traditional methods might miss.
From Assistant to Analyst
Gemini 1.5 Pro unlocks impressive capabilities, enabling the analysis of large volumes of decompiled and disassembled code. It has the potential to significantly change our approach to fighting malware by enhancing efficiency, accuracy, and our ability to scale in response to a growing number of threats.
However, it's important to remember that this is just the beginning. While Gemini 1.5 Pro represents a significant leap forward, the field of gen AI is still in its infancy. There are several challenges that need to be addressed to achieve truly robust and reliable automated malware analysis:
- Obfuscation and packing: Malware authors are constantly developing new techniques to obfuscate their code and evade detection. In response, there's a growing need to not only continuously improve gen AI models but also to enhance the preprocessing of binaries before analysis. Adopting dynamic approaches that utilize various preprocessing tools can more effectively unpack and deobfuscate malware. This preparatory step is crucial for enabling gen AI models to accurately analyze the underlying code, ensuring they keep pace with evolving obfuscation techniques and remain effective in detecting and understanding sophisticated malware threats.
- Increasing binary size: The complexity of modern software is mirrored in the growing size of its binaries. This trend presents a significant challenge, as the majority of gen AI models are constrained by much lower token window limits. In contrast, Gemini 1.5 Pro stands out by supporting up to 1 million tokens—currently the highest known capacity in the field. Nevertheless, even with this remarkable capability, Gemini 1.5 Pro may encounter limitations when handling exceptionally large binaries. This underscores the ongoing need for advancements in AI technology to accommodate the analysis of increasingly large files, ensuring comprehensive and effective malware analysis as software complexity continues to escalate.
- Evolving attack techniques: As attackers continuously innovate, crafting new methods to bypass security measures, the challenge for gen AI models extends beyond simple adaptability. These models must not only learn and recognize new threats but also evolve in conjunction with the efforts of researchers and developers. There's a need to devise new methods for automating the preprocessing of threat data, which would enrich the context provided to AI models. For instance, integrating additional data from static and dynamic analysis tools, such as sandbox reports, plus the decompiled and disassembled code, can significantly enhance the models' understanding and detection capabilities.
The journey towards scaling automated malware analysis is ongoing, but Gemini 1.5 Pro marks a significant milestone. Give Gemini 1.5 Pro a try; we look forward to seeing the innovative ways the community leverages it to enhance security operations.
At GSEC Malaga, we continue to research and develop ways to apply these models effectively in AI, pushing the boundaries of what's possible in cybersecurity and contributing to a safer digital future.
Malware Details
The following table contains details on the malware samples discussed in this post.
| Filename | SHA-256 Hash | Size | First Seen | File Type |
| --- | --- | --- | --- | --- |
| lhdfrgui.exe (WannaCry dropper) | 24d004a104d4d54034dbcffc2a4b19a11f39008a575aa614ea04703480b1022c | 3.55 MB (3723264 bytes) | 2017-05-12 | Win32 EXE |
| tasksche.exe (WannaCry cryptor) | ed01ebfbc9eb5bbea545af4d01bf5f1071661840480439c6e5babe8e080e41aa | 3.35 MB (3514368 bytes) | 2017-05-12 | Win32 EXE |
| EXEC.exe | 1917ec456c371778a32bdd74e113b07f33208740327c3cfef268898cbe4efbfe | 306.50 KB (313856 bytes) | 2022-04-18 | Win32 EXE |
| medui.exe | 719b44d93ab39b4fe6113825349addfe5bd411b4d25081916561f9c403599e50 | 833.50 KB (853504 bytes) | 2024-03-27 | Win32 EXE |
Prompt
The following is the exact prompt used in all the examples covered in the post. The only exception is the example where the word "disassembled" is used instead of "decompiled" because, as explained, we're working with disassembled code rather than decompiled code to show that Gemini 1.5 Pro can interpret both.
Act as a malware analyst by thoroughly examining this decompiled executable code. Methodically break down each step, focusing keenly on understanding the underlying logic and objective. Your task is to craft a detailed summary that encapsulates the code's behavior, pinpointing any malicious functionality. Start with a verdict (Benign or Malicious), then a list of activities including a list of IOCs if any URLs, created files, registry entries, mutex, network activity, etc.
+[attached decompiled.c.txt sample file]
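For readers who want to reproduce the setup, the prompt and the one-word substitution described above can be assembled programmatically. This is a minimal sketch under stated assumptions: `build_prompt` is a hypothetical helper, and appending the code as plain text after the prompt is one plausible way to "attach" the file, not necessarily how it was done for this post. No model API call is shown.

```python
PROMPT_TEMPLATE = (
    "Act as a malware analyst by thoroughly examining this {kind} executable "
    "code. Methodically break down each step, focusing keenly on understanding "
    "the underlying logic and objective. Your task is to craft a detailed "
    "summary that encapsulates the code's behavior, pinpointing any malicious "
    "functionality. Start with a verdict (Benign or Malicious), then a list of "
    "activities including a list of IOCs if any URLs, created files, registry "
    "entries, mutex, network activity, etc."
)

ALLOWED_KINDS = ("decompiled", "disassembled")


def build_prompt(code: str, kind: str = "decompiled") -> str:
    """Return the post's prompt followed by the code to analyze.

    `kind` selects the single word that differs between the decompiled
    and disassembled examples.
    """
    if kind not in ALLOWED_KINDS:
        raise ValueError(f"kind must be one of {ALLOWED_KINDS}, got {kind!r}")
    # Assumption: the sample is appended as text after a blank line.
    return PROMPT_TEMPLATE.format(kind=kind) + "\n\n" + code


prompt = build_prompt("int main(void) { return 0; }", kind="decompiled")
print(prompt[:80])
```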
D^3CTF 2024 By W&M
OSED Review – OffSec Exploit Developer 2024
Survey Confirms Growing Reliance on SaaS Tools in the Enterprise, Taxing IT Professionals
Department of Commerce Announces New Actions to Implement President Biden’s Executive Order on AI
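Baidu Comate: "AI+" Makes Software Development More Efficient and More Secure
Beware the New Goldoon Botnet: A Zero-Detection Botnet Family with the Widest Instruction-Set Coverage
A Record of My Struggles Deploying a Simple Static Website on Tencent Cloud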
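The cover image for this post was generated with DALL·E 3.
I have been fairly busy since the end of March. Now that the dust has settled and I have had a few days of rest at home, I can finally get to some of my own things.
For certain reasons (yes, I am joining Tencent), I decided to migrate my blog, previously deployed on Cloudflare Pages (the very site you are reading now), to Tencent Cloud in mainland China. I assumed this would be a trivial operation, nowhere near worth a dedicated write-up, but in reality I went back and forth through several Tencent Cloud products before I barely managed to get the whole continuous integration setup working.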
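I used to be a loyal Alibaba Cloud user. My relationship with Alibaba Cloud is love-hate, though: I have cursed its half-finished product features and its dim-witted customer support that cannot understand plain language more than a few times. Back when I worked at EFC, just walking past the UK Center building and thinking of Alibaba Cloud was enough to make my blood boil. Even so, Alibaba Cloud is still the number-one cloud in China. What does that tell you? That the other clouds are even more of a makeshift operation!
Back to Tencent Cloud. In my first year of university I ran a student-discount server on Tencent Cloud, and destroyed it after graduating when the discount ended. My first impression of Tencent Cloud was that its UI is pleasant, with operation feedback that feels somewhat like Azure. Beyond the UI, however, the functional design of its products still has a lot of room for improvement.
My feeling is that cloud vendors in China all start by building an OpenStack-like system to manage the physical resources of their data centers, then begin selling cloud hosts such as ECS. After a while they figure they can install database software, monitoring software, message-queue middleware, and so on onto an ECS instance and split that out as a separate service such as RDS. After selling that for a while, they discover they can bundle many ECS instances together and sell managed Kubernetes clusters, and once managed Kubernetes is on sale they discover they can build on top of it, run some containers, and sell Serverless services...
And so each new product evolves by plastering another layer on top of the capabilities of an earlier one.
I am not in a position to say whether this approach is right or wrong. I do think that before reusing existing capabilities to build a new product, you must think clearly about the new product's positioning and the core features it will have. If the underlying functionality is too limited, or the necessary configuration options are too "narrow", you should consider starting from scratch instead of plastering a compatibility shim on top.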
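Web Application Hosting: Webify
At first I picked Tencent Cloud's Webify to deploy my static pages without a second thought. The name makes it obvious that it borrows from Netlify, and as a product it is much like Netlify, Vercel, Cloudflare Pages, and other page hosting products.
The problem is that Tencent Cloud did not develop Webify as a standalone product: it is a sub-feature of Tencent Cloud's Cloudbase cloud development product! What is Cloudbase? Something similar to LeanCloud or Heroku, where users host Serverless applications and use the storage, database, cloud functions, and other features that Cloudbase provides.
As a sub-feature of Cloudbase, Webify reuses the CI/CD workflow that Cloudbase uses to deploy applications. To Cloudbase, a Webify instance is a pay-as-you-go WebifyPackage "environment", so the console inexplicably pulls the Cloudbase concept of an "environment" into the Webify product. But this "environment" is created by the system, and if you click into it in the console you get an error saying you have no permission!
On billing, Webify has its own monthly packages covering CDN traffic, static storage capacity, and so on. Yet these quotas are still entangled with the usage of the underlying Cloudbase. So when I filed a ticket asking support how billing works once the CDN traffic runs out, the agent first said that once the traffic is used up requests go straight back to the origin and that it has nothing to do with the CDN service, then a little later sent me the CDN billing documentation. After I pointed out that he was contradicting himself, he called me a while later and only then managed to explain it clearly. (I have noticed that support quality at both Alibaba Cloud and Tencent Cloud has gone downhill: they phone you to explain at the drop of a hat. Why can't they make things clear in an online message or in the docs?)
All of the above, though, is just the console being somewhat unreasonable to operate. Let me try out how the actual product fares.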
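First, Webify does not support automatic builds of Hugo sites. Unlike Cloudflare Pages or Vercel, it cannot infer the tech stack after you select a repository and fill in the build command for you. Webify only supports projects written with common JavaScript frameworks.
The workaround is not hard: I take a small detour and create a repository on GitHub to hold the built Hugo site files. All it takes is an extra step in the original Hugo project's GitHub Actions pipeline that runs the Hugo build and pushes the output to that repository.
After configuring GitHub OAuth authorization in Webify, I selected the repository holding the built static assets and had it statically host the repository contents directly. And then the Webify build failed again...
From the build log I found that this piece of junk takes the repository contents it fetched with git pull, packs them into a ZIP archive, and pushes the archive with the Cloudbase CLI, and the Cloudbase CLI does not support pushing files larger than 100MB! I filed a ticket with support, who replied:
Webify currently limits the size of build artifacts to 100MB; we recommend that the customer reduce the size of the deployment package. Large resources such as images, audio, and video can be uploaded manually with the CLI tool to a fixed directory in the environment.
What??? My site is over 100MB, so it cannot be built automatically and I have to upload it by hand??? The whole point of using Webify was convenience, and in the end I still have to upload things myself?
With no other option, I decided to put the manual CLI upload step into the GitHub Actions workflow, so that Hugo's output is uploaded straight to Webify once the build finishes. After a lot of fiddling it worked, only for Webify to answer page requests with a NO ROUTE error, and I could not find any setting in the console for the default index page or the 404 page. I figured that even if I solved the NO ROUTE problem, the site would still be crippled without a way to configure the default index and 404 pages, so I simply requested a refund and gave up!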
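Back to COS + CDN
That leaves only the traditional static-site deployment approach: upload the static files to COS (Tencent Cloud's object storage) and put a CDN in front of it.
I kept modifying the GitHub Actions pipeline so that it uploads the build output to a COS bucket. Then I discovered that the officially provided COS Action is bug-ridden garbage! Here I am going to call out by name the repository's original author, mingshun: I do not know whether you work at Tencent, but I do know you certainly never seriously tested the code you wrote!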
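Take the following code, for example, at TencentCloud/[email protected]#L110:
    } while (data.IsTruncated === 'true');
The IsTruncated value passed in here can only be a Boolean true or false, yet it is strictly compared against the string 'true'. The comparison is always false, which meant this while loop could never exit and just hung there. I woke up after a night's sleep to find that my workflow had run for six hours before GitHub killed it for timing out.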
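Besides the original author above, there is also Shirasawa. A friend of mine follows this guy too, so I will hold my fire. All I will say is: please read the COS SDK source code a bit more. There is clearly an accelerate parameter for the acceleration domain, yet you insisted on implementing your own:
    Domain: core.getInput('accelerate') === 'true' ? '{Bucket}.cos.accelerate.myqcloud.com' : undefined,
As a result, when accelerate is not enabled, Domain ends up as undefined and an error is thrown.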
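With no other option, and given how poor the official Action is, I simply forked it and made my own version: wuhan005/tencent-cos-action. Then I was surprised to find that from GitHub Actions' US nodes, even when going through the accelerate domain, uploading to a COS bucket located in Shanghai takes 1-2 seconds per file. Each deployment uploads 1000+ files, so well over half an hour goes by. That upload time is unacceptable to me.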
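CODING
So I needed a way to have Hugo build on a node inside mainland China and then upload to the COS bucket from there. This time I set my sights on CODING, the code hosting platform built by Tencent Cloud itself, which is essentially a stitched-together monster with a bit of everything.
Fortunately, it lets you add an external GitHub repository: after authorizing through GitHub OAuth, you install the CODING GitHub App on the repository and configure a WebHook. When the GitHub repository receives a new push, it triggers the CODING pipeline to run a build. After several rounds of debugging, the final working CODING pipeline file is as follows: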
    pipeline {
      agent any
      stages {
        stage('Checkout') {
          steps {
            checkout([
              $class: 'GitSCM',
              branches: [[name: GIT_BUILD_REF]],
              userRemoteConfigs: [[url: GIT_REPO_URL, credentialsId: CREDENTIALS_ID]]
            ])
          }
        }
        stage('Install Hugo') {
          steps {
            sh 'apt install snapd'
            sh 'snap install hugo dart-sass'
          }
        }
        stage('Build') {
          steps {
            sh 'hugo --minify'
          }
        }
        stage('Upload to COS Bucket') {
          steps {
            sh "coscmd config -a ${COS_SECRET_ID} -s ${COS_SECRET_KEY} -b ${COS_BUCKET_NAME} -r ${COS_BUCKET_REGION}"
            sh "coscmd upload -rfs --delete public/ /"
          }
        }
        stage("Refresh CDN Cache") {
          steps {
            sh "pip install --upgrade tencentcloud-sdk-python"
            sh "python ./dev/refresh-tencent-cdn.py -i ${COS_SECRET_ID} -k ${COS_SECRET_KEY}"
          }
        }
      }
    }
I added a Python script to the Hugo repository that refreshes the Tencent Cloud CDN cache; it runs after a successful upload. A complete build and deployment now takes roughly 3-4 minutes.
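The refresh script itself is not shown in this post. As a rough idea of what a script invoked with `-i` and `-k` like the one in the pipeline might contain, here is a minimal sketch built on `tencentcloud-sdk-python` and its CDN `PurgePathCache` call. Everything beyond the two command-line flags is an assumption: the function names, the extra `-p` option, the `https://example.com/` placeholder path, and the choice of `"flush"` as the refresh type.

```python
import argparse
import json


def parse_args(argv=None):
    """Parse the -i/-k flags used in the pipeline, plus a hypothetical -p."""
    parser = argparse.ArgumentParser(description="Purge Tencent Cloud CDN cache")
    parser.add_argument("-i", "--secret-id", required=True)
    parser.add_argument("-k", "--secret-key", required=True)
    # Placeholder default: replace with the site's own CDN directory URL.
    parser.add_argument("-p", "--path", default="https://example.com/")
    return parser.parse_args(argv)


def build_purge_params(path: str) -> str:
    """JSON body for a directory purge; "flush" refreshes changed resources."""
    return json.dumps({"Paths": [path], "FlushType": "flush"})


def purge(secret_id: str, secret_key: str, path: str) -> str:
    """Submit the purge request and return the raw JSON response."""
    # Imported lazily so the helpers above work without the SDK installed.
    from tencentcloud.common import credential
    from tencentcloud.cdn.v20180606 import cdn_client, models

    cred = credential.Credential(secret_id, secret_key)
    client = cdn_client.CdnClient(cred, "")  # the CDN API takes no region
    request = models.PurgePathCacheRequest()
    request.from_json_string(build_purge_params(path))
    return client.PurgePathCache(request).to_json_string()


def main(argv=None):
    args = parse_args(argv)
    print(purge(args.secret_id, args.secret_key, args.path))
```

To run it as a script, call `main()` under the usual `if __name__ == "__main__":` guard. Passing the keys as command-line arguments mirrors the pipeline above; reading them from environment variables would keep them out of process listings.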
Barely acceptable. Keep in mind that on Cloudflare Pages this finishes in 1-2 minutes, without my having to do all this configuration myself!!
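Back to CDN Abuse Protection
After a lot of trouble, I finally managed to deploy the blog on Tencent Cloud.
The reason I had migrated to Cloudflare earlier is that on both Qiniu Cloud and Alibaba Cloud my CDN traffic was maliciously run up, leaving me with a bill of ¥600+ overnight. I have no idea why there are so many people on the internet doing this kind of stupid thing that hurts others without benefiting themselves.
So before migrating, I researched Tencent Cloud's CDN abuse protection features very carefully. My conclusion was that they have, surprisingly, done a pretty good job; you could say they have really made an effort. Under the "Security Management" menu of COS object storage there is even a "Traffic Abuse Risk Detection" feature! It assesses the risk of traffic abuse along several dimensions, which was a genuinely pleasant surprise! I suggest Alibaba Cloud catch up quickly.
To summarize, these are the settings involved, along with the values I configured.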
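| Product | Setting | Notes |
| --- | --- | --- |
| Object Storage (COS) | Bucket permissions | Set to private read/write, authorize the CDN sub-user for access, and ban all other public requests |
| CDN | Hotlink protection | Referer whitelist (treats the symptom, not the cause; a CC attack only needs to add the header) |
| CDN | Per-IP access rate limit | 10 QPS (per-IP limit; somewhat effective) |
| CDN | Downstream speed limit | All content, limited to 1024KB/s (I feel this value could go lower still, to guard against traffic abuse) |
| CDN | Usage cap | Instantaneous traffic above 2GB per five minutes, HTTPS requests above 1 million per five minutes, or cumulative traffic above 10GB before midnight of the same day. (Once triggered, the CDN service is shut off directly, so I do not wake up to an exploded bill) |
Whether the settings above can really withstand a CC attack depends on how quickly Tencent Cloud's usage cap takes effect. Officially it takes around 10 minutes, which still feels a bit long to me: what if the attacker pushes 1 TB of traffic in those 10 minutes? At the same time, Tencent Cloud's official docs describe another approach: a scheduled Serverless function that calls the Tencent Cloud API to check CDN usage and, once usage exceeds the limit, uses the API to shut down the CDN service. Since it is a self-built scheduled Serverless function, the interval can be set shorter. I may try this later.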
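The decision logic such a scheduled function would need is small. Below is a minimal sketch of just that part, using the three usage-cap thresholds from my settings above. The names are hypothetical, GB is assumed to mean 1024^3 bytes, and the calls that would fetch usage from the Tencent Cloud API and stop the CDN service are deliberately left out, since I have not built this yet.

```python
from dataclasses import dataclass
from typing import List

GB = 1024 ** 3  # assumption: binary gigabytes; billing may count differently


@dataclass
class Usage:
    traffic_5min_bytes: int    # traffic in the most recent five minutes
    https_requests_5min: int   # HTTPS requests in the most recent five minutes
    traffic_today_bytes: int   # cumulative traffic since midnight


# The thresholds from the usage-cap setting described above.
CAPS = Usage(
    traffic_5min_bytes=2 * GB,
    https_requests_5min=1_000_000,
    traffic_today_bytes=10 * GB,
)


def breached_caps(usage: Usage, caps: Usage = CAPS) -> List[str]:
    """Return the names of every cap that the current usage exceeds."""
    breached = []
    if usage.traffic_5min_bytes > caps.traffic_5min_bytes:
        breached.append("traffic_5min")
    if usage.https_requests_5min > caps.https_requests_5min:
        breached.append("https_requests_5min")
    if usage.traffic_today_bytes > caps.traffic_today_bytes:
        breached.append("traffic_today")
    return breached


def should_stop_cdn(usage: Usage, caps: Usage = CAPS) -> bool:
    """True when any cap is exceeded and the CDN should be shut off."""
    return bool(breached_caps(usage, caps))


# Example with made-up numbers: a burst of traffic in the last five minutes.
sample = Usage(traffic_5min_bytes=3 * GB, https_requests_5min=20_000,
               traffic_today_bytes=4 * GB)
print(breached_caps(sample), should_stop_cdn(sample))
```

Running a check like this every minute instead of waiting for the built-in cap would shrink the window in which an attacker can run up traffic, at the cost of a few extra API calls.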
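A Few Final Words
Later on I may also migrate the workloads on my Alibaba Cloud cluster over to Tencent Cloud.
For the past month or so I have not been sleeping well and was constantly anxious. Fortunately everything is settled now: I got the Tencent offer I wanted, and this round of the "golden March, silver April" hiring season went fairly smoothly for me. The doubt, regret, and frustration along the way no longer seem important in hindsight.
Standing at yet another starting point in life, it still does not feel quite real. As for hurriedly packing up and moving to Shanghai afterwards, I am not sure I am ready. What I can say for certain is that I have stepped out of my old comfort zone; whether what lies ahead is another, more comfortable comfort zone or a tougher challenge remains to be seen.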