boredhackerblog: Building a honeypot network with inetsim, suricata, vector.dev, and appsmith

I wanted to learn a bit more about data engineering, databases, app building, and managing systems, so I decided to build a small honeypot network as a project. I was partially inspired by Greynoise and AbuseIPDB, both of which I use a lot. I wanted to get this done in about a week, so it's a small project that isn't very scalable. I ended up learning things, so that's fine.

My goals:

- Use Suricata to see what type of signatures are triggered based on the incoming traffic from the internet

- Save all the Suricata logs to disk in a central place so I can go back and search the data or re-ingest it

- Send logs to Humio for searching, dashboarding, and potentially alerting purposes

- Have a webapp for searching for an IP

-- Webapp should show the signatures the IP has triggered, the first time the IP was seen, the last time the IP was seen, and the number of times it was seen triggering signatures.
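To make that last goal concrete, here's a rough sketch of the kind of lookup the webapp needs. This is not the query from my actual app: the table name `alerts` and the columns `src_ip`, `signature`, and `seen_at` are made up for illustration, and it uses Python's built-in sqlite3 as a stand-in so it runs anywhere. Timestamps are assumed to be ISO 8601 strings in UTC so that MIN/MAX sort correctly.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory table with made-up rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE alerts (src_ip TEXT, signature TEXT, seen_at TEXT)")
    conn.executemany(
        "INSERT INTO alerts VALUES (?, ?, ?)",
        [
            ("203.0.113.7", "Example SSH scan", "2022-05-01T10:00:00Z"),
            ("203.0.113.7", "Example SSH scan", "2022-05-03T08:30:00Z"),
            ("203.0.113.7", "Example HTTP probe", "2022-05-02T12:15:00Z"),
            ("198.51.100.9", "Example HTTP probe", "2022-05-02T09:00:00Z"),
        ],
    )
    return conn


def ip_summary(conn, ip):
    """Return signatures, first seen, last seen, and hit count for one IP."""
    first_seen, last_seen, hits = conn.execute(
        "SELECT MIN(seen_at), MAX(seen_at), COUNT(*) FROM alerts WHERE src_ip = ?",
        (ip,),
    ).fetchone()
    signatures = [
        row[0]
        for row in conn.execute(
            "SELECT DISTINCT signature FROM alerts WHERE src_ip = ? ORDER BY signature",
            (ip,),
        )
    ]
    return {
        "ip": ip,
        "signatures": signatures,
        "first_seen": first_seen,
        "last_seen": last_seen,
        "hits": hits,
    }
```

Calling `ip_summary(make_demo_db(), "203.0.113.7")` returns a dict with the four things the webapp should show. For an IP that was never seen, `hits` is 0 and the timestamps are `None`. The same SQL should carry over to Postgresql with the driver's own placeholder style.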

My tech stack:

- Sensors & databases are hosted on Vultr w/ Ubuntu

- Suricata, obviously, for identifying the type of attack attempt

- Inetsim - this is not the best choice (any attacker who manually looks at their scan results can tell I'm not running real services, just inetsim), but it'll do for this project

- Zerotier - all sensors are connected to a Zerotier network, which makes networking, moving data around, and management easier

- Vector.dev - I'm using vector.dev to move data around

- Humio - it's for log storage and search, just like ELK or Splunk

- rinetd - I'm not actually running inetsim on every sensor; the sensors just forward all their traffic to one host running inetsim (it's good enough for this project)

- Redis - pubsub. I'm putting alerts into redis and letting python grab them and put the data in postgresql

- Postgresql - to store malicious IP, signature, and timestamp

- Appsmith - to build the web UI app (usually I'd use flask...)

Networking:

Network kinda looks like this w/ Zerotier:

Sensors are exposed to the internet; servers aren't. rinetd takes in the traffic hitting the sensors from the internet and forwards it to inetsim, which is bound to a Zerotier IP address.

Configuration for rinetd: https://github.com/BoredHackerBlog/dumbhoneypot/blob/main/rinetd.conf
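If you haven't used rinetd, what it does here is basically a plain TCP relay: listen on one address and port, then copy bytes to and from another. Here's a minimal Python sketch of that idea using asyncio. It's only an illustration of the concept, not how rinetd is implemented, and the function names `pipe` and `start_forwarder` are made up.

```python
import asyncio


async def pipe(reader, writer):
    """Copy bytes from reader to writer until the reading side hits EOF."""
    try:
        while True:
            data = await reader.read(4096)
            if not data:
                break
            writer.write(data)
            await writer.drain()
    except OSError:
        pass  # peer went away mid-copy; nothing more to forward
    finally:
        writer.close()


async def start_forwarder(listen_host, listen_port, target_host, target_port):
    """Listen on one address and relay each connection to the target."""

    async def handle(client_reader, client_writer):
        try:
            up_reader, up_writer = await asyncio.open_connection(
                target_host, target_port
            )
        except OSError:
            client_writer.close()  # target unreachable, drop the client
            return
        # Relay both directions at once until either side closes.
        await asyncio.gather(
            pipe(client_reader, up_writer),
            pipe(up_reader, client_writer),
        )

    return await asyncio.start_server(handle, listen_host, listen_port)


# Example usage (addresses are placeholders, not my real ones):
#
# async def main():
#     server = await start_forwarder("0.0.0.0", 8080, "10.0.0.2", 80)
#     async with server:
#         await server.serve_forever()
#
# asyncio.run(main())
```

In the real setup, the listening side is the sensor's public address and the target is the inetsim host's Zerotier address. rinetd handles all of that from a short config file, which is why I used it instead of writing something like this.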

Logging:

The flow for logs kinda looks like this:

Vector on all the sensors reads eve.json and sends the data to vector on the ingest server.

Vector on the ingest server does multiple things. It saves all the data to disk and sends it to Humio. The alerts also get geoip info added and then go to redis, where a python script picks them up and puts them into postgres.

postgres stores malicious IP, suricata signature, and timestamp.
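As a rough idea of what the redis-to-postgres step has to do with each message, here's a small sketch that pulls those three fields out of one Suricata EVE event. It's not my actual script. It assumes the message is still Suricata's JSON with the usual top-level `event_type`, `src_ip`, and `timestamp` fields and a nested `alert.signature`. The function name `alert_row` and the sample event are made up.

```python
import json
from typing import Optional, Tuple


def alert_row(line: str) -> Optional[Tuple[str, str, str]]:
    """Return (src_ip, signature, timestamp) for an alert event, else None."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        return None  # not valid JSON, skip it
    if not isinstance(event, dict) or event.get("event_type") != "alert":
        return None  # flow, dns, http, etc. are not stored in the IP table
    try:
        return (event["src_ip"], event["alert"]["signature"], event["timestamp"])
    except (KeyError, TypeError):
        return None  # alert event missing a field we need


# Made-up sample event in the general shape Suricata writes to eve.json.
SAMPLE = json.dumps(
    {
        "timestamp": "2022-05-01T12:00:00.123456+0000",
        "event_type": "alert",
        "src_ip": "203.0.113.7",
        "dest_port": 22,
        "alert": {"signature_id": 1000001, "signature": "Example SSH scan"},
    }
)
```

`alert_row(SAMPLE)` gives back the tuple that would become one row in postgres; anything that isn't a well-formed alert comes back as `None` and can be skipped.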

Sensor vector config: https://github.com/BoredHackerBlog/dumbhoneypot/blob/main/sensor_vector.toml

Server vector config: https://github.com/BoredHackerBlog/dumbhoneypot/blob/main/server_vector.toml

Python script being used to process redis data and add data to postgres: https://github.com/BoredHackerBlog/dumbhoneypot/blob/main/process_redis.py