[GitHub Trending] NanmiCoder/MediaCrawler
Relevance: 5.4
Score Breakdown
technical depth: 6
novelty: 5
actionability: 6
community: 6
strategic: 4
personal: 4
Social media crawler, tangentially related to data engineering but not a core interest.
Summary
MediaCrawler is an open-source tool for scraping public data from Chinese social media platforms, including Xiaohongshu, Douyin, Kuaishou, Bilibili, Weibo, Tieba, and Zhihu. It uses Playwright for browser automation: the crawler works inside a logged-in browser session and keeps that login state, so it avoids having to reverse-engineer each platform's JavaScript. Supported features include keyword search, crawling of individual posts, comment extraction, and IP proxy pools. A Pro version adds AI-agent-based content extraction, resuming interrupted crawls, and multi-account support, and drops the Playwright dependency for simpler deployment.
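To make the "IP proxy pool" idea concrete, here is a minimal sketch of round-robin proxy rotation that skips proxies marked as failed. It is an illustration only: the `ProxyPool` class, its method names, and the example addresses are hypothetical and are not taken from MediaCrawler's code.

```python
from typing import List, Set


class ProxyPool:
    """Hypothetical round-robin proxy pool (not MediaCrawler's actual class)."""

    def __init__(self, proxies: List[str]) -> None:
        if not proxies:
            raise ValueError("proxy list must not be empty")
        self._proxies = list(proxies)
        self._bad: Set[str] = set()  # proxies that have been reported as failing
        self._index = 0              # position of the next candidate proxy

    def mark_bad(self, proxy: str) -> None:
        """Exclude a proxy from future rotation, e.g. after a blocked request."""
        self._bad.add(proxy)

    def next_proxy(self) -> str:
        """Return the next healthy proxy in round-robin order."""
        # Check each proxy at most once per call so the loop always terminates.
        for _ in range(len(self._proxies)):
            proxy = self._proxies[self._index]
            self._index = (self._index + 1) % len(self._proxies)
            if proxy not in self._bad:
                return proxy
        raise RuntimeError("no healthy proxies left")


if __name__ == "__main__":
    # Placeholder addresses, assumed for illustration only.
    pool = ProxyPool(["http://10.0.0.1:8080", "http://10.0.0.2:8080"])
    for _ in range(3):
        print(pool.next_proxy())
```

A crawler would typically call `next_proxy()` before each request and `mark_bad()` when a request through that proxy is blocked or times out. A production pool would likely also refresh its list from a provider and expire stale entries, which this sketch leaves out.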
Author
NanmiCoder