Sunday, December 12, 2021

notes/links about log collection, storage, and searching


Just some notes about log collection, storage, and searching.

I just want to be able to store some log data for a long time and search it once in a while later on. I'm not trying to produce a report with the data, do alerting, or transport the logs securely.

One of my use cases is collecting network data, storing it for a long time, and maybe searching later for a specific domain or IP that could've been related to a security incident.

Similar for incoming http traffic. I'd like to be able to see if someone tried to access a specific URI a really long time ago (maybe from before a vuln related to that URI was public).

(leaving out elasticsearch-based things, splunk, and cloud-based services)

These notes/links should help w/ research if anyone else is trying to do the same thing as me.

Gathering & shipping logs:

For Windows Event Logs:

- fluentbit

- fluentd

- nxlog

- winlogbeat

- promtail

- Windows event forwarding - WEF sends logs from all the hosts to one collector host

For other text-file-based logs (linux, webapp, etc.):

- all the tools above

- vector

- filebeat

- rsyslog

- syslog-ng

- logstash

some of the tools listed above can take in forwarded events (syslog, logstash/beats, etc.) from other products and tools as well.

- kafka - another option for just getting logs from various sources and forwarding them to some other place

input/output, sources/sinks:

- kafka

- vector

- fluentbit

- fluentd

- logstash

- rsyslog

Log processing:

You may want to process the data to drop certain events or append data to some events. For example, for network data, you may want to use a filter that adds geoip info. You may also want to rename fields.
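As a rough sketch of what that kind of processing step can look like (the field names and the geoip lookup here are made up for illustration):

def lookup_country(ip):
    # stand-in for a real geoip database lookup (e.g. maxmind)
    return "US" if ip else None

def transform(event):
    # drop events we don't care about
    if event.get("event_type") == "health_check":
        return None
    # rename a field to match the rest of the pipeline
    if "src" in event:
        event["source_ip"] = event.pop("src")
    # append data to the event, e.g. geoip info
    event["geoip"] = {"country": lookup_country(event.get("source_ip"))}
    return event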

Many of the collectors and shippers listed above already have some ability to modify or parse the log data. 

Some of the tools call these plugins/modules filters, processors, or transformers. You may also be able to write your own plugins or some code (some tools above support Lua) to change the logs before the output stage happens.

Depending on the type of processing you want to do, you may need to output the logs in a format your application understands, process them there, and put them back into the pipeline for the next step or storage.

For kafka, I found faust, but there are other libraries too for python and other langs.
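Here's a minimal faust sketch just to show the shape of the idea (the topic names and broker address are assumptions):

import json
import faust

app = faust.App("log-processor", broker="kafka://localhost:9092")
raw_logs = app.topic("raw-logs", value_type=bytes)
clean_logs = app.topic("clean-logs", value_type=bytes)

@app.agent(raw_logs)
async def process(stream):
    # consume raw events, drop the noise, forward the rest
    async for raw in stream:
        event = json.loads(raw)
        if event.get("event_type") == "health_check":
            continue
        await clean_logs.send(value=json.dumps(event).encode())

# run with: faust -A log_processor worker (if this file is log_processor.py)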

Log storage:

The output part in almost all the tools listed above can send data to various places where logs can be indexed and/or stored.

You can always store logs to disk on one host w/ compression (obviously searching this is not very fun). Files can also be stored in the cloud; pretty much everything has s3 output support.

For files stored on disk, many of the tools will allow you to select format such as text, json, etc..

Tools such as logrotate can be used to move, compress, or delete the logs.

A cron job/scheduled task and some scripts can always be used to move, compress, or delete files as well.
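A cleanup script for a cron job could be as simple as something like this (the archive path and retention periods are made up):

import gzip
import shutil
import time
from pathlib import Path

LOG_DIR = Path("/var/log/archive")  # assumption: where the logs get dumped
DAY = 86400
now = time.time()

# compress logs older than 7 days
for path in LOG_DIR.glob("*.log"):
    if now - path.stat().st_mtime > 7 * DAY:
        with open(path, "rb") as src, gzip.open(f"{path}.gz", "wb") as dst:
            shutil.copyfileobj(src, dst)
        path.unlink()

# delete compressed logs older than a year
for path in LOG_DIR.glob("*.log.gz"):
    if now - path.stat().st_mtime > 365 * DAY:
        path.unlink()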

For being able to easily store and search logs, there is Grafana Loki.

Grafana Loki is somewhat similar to elasticsearch or splunk, and you can use the Grafana webui to query the data.

While doing more research, I came across clickhouse (which is also supported by some of the tools above). Clickhouse can store json data and you can run sql queries on that data.
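For example, querying from python with the clickhouse-driver package could look like this (the table and column names are made up):

from clickhouse_driver import Client

client = Client(host="localhost")
rows = client.execute(
    "SELECT timestamp, source_ip, domain "
    "FROM dns_logs "
    "WHERE domain = %(domain)s",
    {"domain": "example.com"},
)
for timestamp, source_ip, domain in rows:
    print(timestamp, source_ip, domain)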

I also came across cloki, which uses clickhouse but emulates loki.

The backend is a clickhouse database, and you push logs into the loki emulator just like you'd push logs into loki. cloki also supports the same query language as loki and will work with the grafana loki connector.

Log search:

Searching the logs obviously depends on how they're stored. For uncompressed or compressed logs, tools such as grep, zgrep, or ripgrep can be used for searching.
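For example, assuming archived logs under /var/log/archive, zgrep -h "example.com" /var/log/archive/*.gz searches the compressed files directly, and ripgrep can do the same with rg -z "example.com" /var/log/archive/.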

On Windows, there are a few tools that can be used to search and/or query logs. FileSeek can be used to search a bunch of files. There is LogFusion as well, which can be used to read log files.

There is also Log Parser Lizard, which can be used to query log files and even save queries and produce charts or reports.

Files can also be loaded into python w/ pandas for simple searches, complex searches, or statistical analysis. Pandas supports loading various file types.
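A quick pandas sketch for json-lines logs (the file and column names are made up):

import pandas as pd

# load json-lines logs into a dataframe
df = pd.read_json("dns_logs.json", lines=True)

# simple search: rows mentioning a domain
hits = df[df["domain"].str.contains("example.com", na=False)]
print(hits)

# something more statistical: top 10 source IPs by event count
print(df["source_ip"].value_counts().head(10))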

Finally, if you end up using loki or cloki, grafana can be used to do queries. Grafana also has connectors/plugins for other database/log storage systems. 

Sample logs:

To play with any of the tools above without making changes in a production env, you can use sample logs or data sources:

- a github repo that links to several sample logs
- logs related to security (there are some network traffic logs in there)
- EDGAR log files
- various log files
- a log generator (various types)
- another log generator
- certificate transparency logs
- MQTT demo data, if you want to grab that. I'm pretty sure people are using this for free for their projects too...

ps: i'm not an engineer or an observability expert. Implementation of the various tools above varies and may have an impact on resource usage.

Friday, November 26, 2021

Collecting Unifi logs with Vector and Grafana Loki


This post just discusses sending unifi logs to grafana loki utilizing the vector agent.

Typically for log collection I would utilize something like Beats (filebeat, winlogbeat) and Logstash. Unfortunately, in my experience, Logstash uses too much memory and CPU, so I decided to search for an alternative. I came across vector, fluentd, and fluentbit. Vector seemed to be easy to install, configure, and use, so I decided to give that a try.

For log storage and search, I would normally use Elasticsearch & Kibana, Opensearch, Graylog, or Humio. Humio would be hosted in the cloud, and anything that's Elasticsearch or Elasticsearch-based would require too much memory and CPU. I found Grafana Loki and decided to try that. It seems relatively lightweight for my needs and runs locally. Also, I saw a Techno Tim video on Loki recently.

Logs will be stored with Loki and I'll use Grafana to connect to Loki and use it to query and display the data.

Vector and Grafana Loki will be running on a NUC w/ Celeron CPU w/ 4GB RAM so having something that runs on Pi (grafana has an article where they run grafana loki on a pi) is nice.


Unifi controller has an option to send logs to a remote system so that's what I'll be using to send logs. It will send syslog (udp) to an IP address. 

Vector has sources, transforms, and sinks. A source is an input/data source, transforms can apply various operations to the data (such as filtering or renaming fields), and a sink is basically the output. I will just be using a source and a sink. The source in this case will be syslog; vector will listen on a port for syslog messages. The sink will be Loki since that's where the logs will be stored.

I'll have one VM running vector and the same VM will be running Grafana UI and Loki using docker-compose.

Unifi Controller Syslog -> (syslog source) Vector (Loki sink) -> Loki <- Grafana WebUI

I am not doing any encryption in transit or using authentication for loki, but it is an option.


I have an Ubuntu 20.04 server w/ docker and docker-compose installed.

Grafana Loki

The Grafana docker tutorial shows how to set up grafana loki with docker-compose.

I removed promtail container from my configuration.

Here's roughly the configuration I'm using (a sketch based on the tutorial's compose file with promtail removed; image tags may differ):
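version: "3"

services:
  loki:
    image: grafana/loki:latest
    ports:
      - "3100:3100"
    volumes:
      - ./loki:/etc/loki
    command: -config.file=/etc/loki/local-config.yaml

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    volumes:
      - ./grafana:/var/lib/grafana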

Create a new loki folder and a grafana folder, which docker will mount.

Download the loki config file, place it in the loki folder, and rename the file to local-config.yaml. Change the configuration if needed.

No need to download and place anything in the grafana folder.

Run docker-compose up -d to start grafana and loki.

Grafana webui is running on port 3000 and default creds are admin/admin.

Go to configuration and add loki as the data source. docker-compose file refers to that container as loki so it'll be at http://loki:3100.


Now Vector needs to be set up.

I'm setting it up by just following their quickstart guide.

I ran: curl --proto '=https' --tlsv1.2 -sSf https://sh.vector.dev | bash

Default config file is located at ~/.vector/config/vector.toml

Here's my config for the syslog source and loki sink, roughly (the loki endpoint and label key/value are placeholders to adjust for your setup):
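# listen for syslog over udp on a non-privileged port
[sources.unifi_syslog]
type = "syslog"
address = "0.0.0.0:1514"
mode = "udp"

# ship everything from the syslog source to loki
[sinks.loki]
type = "loki"
inputs = ["unifi_syslog"]
endpoint = "http://localhost:3100"
encoding.codec = "json"
labels.job = "unifi"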

I modified the syslog port to be 1514 so I can run vector as a non-privileged user, and I also changed the mode to udp.

For the loki sink, at least one label is required, but your label key/value can be anything you prefer. I could have done labels.system = "unifi" and it would work just fine.

Once configuration is done, the following command can be run to start vector: vector --config ~/.vector/config/vector.toml

Unifi controller

In unifi controller settings, remote logging option is under Settings -> System -> Application Configuration -> Remote Logging

Here's what my configuration looks like: remote logging enabled, with the syslog host set to the IP of the VM running vector and the port set to 1514.

Click Apply to apply changes and the logs should flow to vector and into loki.
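To check that logs are arriving, you can run a LogQL query in Grafana's Explore view, e.g. {job="unifi"} to see everything with that label, or {job="unifi"} |= "firewall" to filter for lines containing "firewall" (assuming the label key/value from the vector sink config above).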


no logs in grafana query

I did have a weird issue where logs didn't show up in a grafana query but would show up when I did a live query.

I ran "sudo timedatectl set-timezone America/New_York" to update my timezone and that fixed the issue. (or it didn't but i think it did because queries did show results after i ran this)


Saturday, April 10, 2021

Creating a malware sandbox for sysmon and windows event logs with virtualbox and vmexec


I was doing some research around detection related to maldocs/initial access. Usually, I've seen malicious Word or Excel documents, and in some cases compressed files containing a Word document, Excel document, script, or executable. In a lot of cases LOLBIN/LOLBAS are abused. You can see this happening a lot in sandbox outputs (anyrun, VT dynamic, hatching triage, etc.) as well.

I came across some guidance around blocking some LOLBIN/LOLBAS files with Windows Firewall to prevent some of the initial compromise activity. There are multiple scripts and blog posts related to this. Essentially, Windows Firewall rules are added to prevent some of the executables from connecting to the internet.


I also saw posts where Olaf Hartong was discussing sandbox data related to malware and LOLBIN/LOLBAS usage, and rundll32 as well.

I thought it would be interesting to collect data on my own and have my own dataset to play with. I also wanted the ability to test malware in an environment where some hardening was applied, such as mentioned in the blog posts and scripts above. In addition to that, I wanted the ability to have an EDR agent or AV agent in the same sandbox to see what it collects or alerts on in its management console. I ended up writing vmexec to help me with this.

vmexec is similar to cuckoo sandbox and cape sandbox, but it doesn't get any information back from the VMs. It just puts the executable in the VM and executes it. When you upload the sample, you can pick a VM or use any available VM and set how long the VM will run for after the sample is uploaded. It uses virtualbox for VMs and, just like cuckoo or cape, you need to have an agent inside the VM.


I'll be using a Windows 10 VM with various logging enabled and sysmon installed. I'm using sysmon-modular rules.

For forwarding logs, I'll be using winlogbeat OSS. I'm using the OSS version because I'll be using Opendistro for Elasticsearch elastic and kibana containers.

Since I'll be running malware, I'll have to have a second VM for routing the malicious traffic, but that's not required if you're okay with threat actors potentially seeing your connections. You can always set up the sandbox VM so it doesn't route any traffic at all as well.

The network and VM design kinda looks like this:
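Windows 10 sandbox VM --(NIC1: internal network)--> router VM --> internet
Windows 10 sandbox VM --(NIC2: host-only adapter)--> winlogbeat --> elastic & kibana containers on the host
vmexec (flask app on the host) --(host-only adapter)--> agent inside the Windows 10 sandbox VM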


Getting all the packages and dependencies:

  1. Install Ubuntu 20.04 (although pretty much any Linux OS should work)
  2. Install Docker (
  3. Install docker-compose (
  4. Install Virtualbox (
  5. Make sure python3 and python3-pip are installed
    1. Might have to run apt install python3 python3-pip
  6. Install python packages
    1. Run the commands below:
      1. pip3 install flask
      2. pip3 install flask-sqlalchemy
      3. pip3 install flask-admin
  7. Download vmexec
    1. if you have git installed you can run:
      1. git clone

Getting Elastic and Kibana up and running:

I'm using a docker-compose file for elastic and kibana. 

research@workstation13:~/elk$ cat docker-compose.yml

version: '3'

services:
  odfe-node1:
    image: amazon/opendistro-for-elasticsearch:1.13.1
    container_name: odfe-node1
    environment:
      - discovery.type=single-node
      - bootstrap.memory_lock=true # along with the memlock settings below, disables swapping
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m" # minimum and maximum Java heap size, recommend setting both to 50% of system RAM
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536 # maximum number of open files for the Elasticsearch user, set to at least 65536 on modern systems
        hard: 65536
    volumes:
      - odfe-data1:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
    networks:
      - odfe-net
  odfe-kibana:
    image: amazon/opendistro-for-elasticsearch-kibana:1.13.1
    container_name: odfe-kibana
    ports:
      - 5601:5601
    expose:
      - "5601"
    environment:
      ELASTICSEARCH_URL: https://odfe-node1:9200
      ELASTICSEARCH_HOSTS: https://odfe-node1:9200
    networks:
      - odfe-net

volumes:
  odfe-data1:

networks:
  odfe-net:

In the docker-compose.yml file shown above, the data is being stored in odfe-data1 volume. When you take down the containers and bring them up again, the data will not go away. 

Additional information about the opendistro for elasticsearch docker container and settings can be found in the opendistro documentation.

cd into the directory that contains the docker-compose.yml file and run docker-compose up -d to start the containers in the background. To take down the containers, you can run docker-compose down from the same directory.

Once you bring up the containers, elastic will be running on port 9200 and kibana will be on 5601.

Setting up Windows 10 Sandbox

  1. Create a Windows 10 VM in virtualbox
  2. Disable updates
  3. Disable antivirus
  4. Disable UAC
  5. Disable anything else that's not needed
  6. Install whatever applications you need, such as a pdf reader or Office
    1. If you're using Office (Word or Excel), make sure macros are allowed to run automatically
  7. Install Python 3+
  8. Copy the agent script from the vmexec project into the VM (do not run it yet)
There are scripts, such as loggingstuff.bat, that should help with disabling some of these things.

Setting up logging and log forwarding:
  1. Download and install Sysmon with sysmon-modular rules (see the loggingstuff.bat link above)
  2. Enable process auditing and powershell logging
  3. Download and install winlogbeat oss
    1. configure winlogbeat oss to forward logs to the host's IP on the host-only network, which is where elastic will be reachable once we create the host-only adapter (a minimal sketch of this output section follows this list)
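A minimal sketch of that winlogbeat output section (the IP placeholder is whatever your host ends up with on the host-only network; admin/admin are the opendistro demo defaults):

output.elasticsearch:
  hosts: ["https://HOST-ONLY-IP:9200"]
  username: "admin"
  password: "admin"
  ssl.verification_mode: none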

After the base VM is setup, there are some network modifications that are needed.

You will need to create a host-only adapter without dhcp server enabled.

Enable the second NIC on the VM and attach it to host-only adapter.

Set the first NIC/adapter to NAT or internal network or whatever else. I have mine setup to internal network going to my router.

Finally, turn on the VM and set a static IP for the adapter in Windows. Since my vboxnet0 host-only adapter has its own subnet, I set my IP to an address in that subnet.

Reboot the VM, log in, run the agent script, and take a snapshot while the VM is running. Note the IP address, snapshot name, and VM name.

Setting up vmexec
In the main vmexec script, just search for #CHANGEME and modify the settings there.

You'll want to add your VM like this:

db.session.add(VMStatus(name="winVM",ip="",snapshot="Snapshot2", available=True))

name is the name you gave your VM in virtualbox, IP is the static IP that was assigned, and snapshot is the snapshot you're utilizing.


To start using vmexec, you need the docker containers for elastic and kibana running (cd into the directory with your docker-compose.yml file and type docker-compose up -d), and you need your router VM up and running (you can just start that VM). Finally, you need to start vmexec: cd into the vmexec directory and type flask run -h 0.0.0.0 (the -h 0.0.0.0 part is only needed if you want to remotely access the web server). The web server will be running on port 5000.

the webui looks like this:

You can select and upload a file, select a specific VM from the dropdown menu (optional), change the VM run time, and click the submit button.

You can access kibana on port 5601 via web browser. Make sure to setup your index pattern. It should be winlogbeat-*.

In kibana you can search for the executable file that was run and look at surrounding events. With sysmon-modular rules, you can also map events to the MITRE ATT&CK framework.
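For example, with sysmon process creation events forwarded by winlogbeat, a kibana query along the lines of winlog.event_id: 1 and winlog.event_data.Image: *sample.exe* should surface the execution (exact field names can vary between winlogbeat versions).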

Modifying the project

Modifying the project is easy depending on your needs. The agent script can be modified easily if you would like to upload files to a specific location or execute/open them in a certain way. Code could also be added in the vm_process function if additional steps need to be taken before or after running the VM or the file.


Saturday, January 30, 2021

Creating an Active Directory (AD) lab for log-based detection research and development with Vagrant, Humio, and AtomicRedTeam


A few years or months ago, I came across the DetectionLab project and thought it was neat. It would let me conduct attacks, work on detection rules, and test those rules. DetectionLab uses Splunk for storing logs, which I'm not used to, and it also requires a lot of system resources my machine doesn't have.

I then came across DetectionLabELK, which is similar to DetectionLab but uses the ELK stack, which I am familiar with, but I have the same issue with system requirements and not needing some of the components of the project. The DetectionLabELK people (CyberDefenders) provide a cloud version of it, which is very cheap if you want to utilize it for testing things, but I still wanted to have something on my own machine.

I did build an AD lab manually, however, after not taking snapshots and breaking the lab, I decided that I should just use Vagrant.

For my lab needs, I just need to look at logs and not network traffic. I also just need one DC, one Workstation, and a Kali VM. I'm very familiar with using Humio so I decided to use Humio cloud (free) account to store and search my logs. Kali is good for doing certain attacks but I also wanted AtomicRedTeam so I could use that for generating log data and testing queries. The AD lab I made was also inspired by Applied Purple Teaming course and TheCyberMentor ethical hacking course.


Domain: testlab.local
Computers: dc1 - - windows server 2019 desktop
workstation1 - - windows 10
kali - no IP initially, you have to set it to - kali linux

local user: vagrant / vagrant works on all machines
domain users: 
jsmith / Password123
jdoe / 123Password
SQLService / Servicepass123

all domain users are in domain admins group, administrators group, and enterprise admins group.

jsmith is a local admin on workstation1


system requirements:
any modern 4 core 8 thread CPU should be fine. I'm using i7-6700HQ.
around 16GB of RAM should work fine as well.

virtualbox download and installation:
Download and install virtualbox.
Install the Oracle VM VirtualBox Extension Pack as well.

vagrant download and installation:
Download and install vagrant.
Once vagrant is installed, open a command line and run "vagrant plugin install vagrant-reload" to install the reload plugin.

downloading the github project:
Download the zip and unzip it, or run git clone.

setting up humio:
Get a Humio account and log in.
Create a new token for this project. You can leave the parser as None. Copy the token.
Edit the winlogbeat.yml file and change the password to your token.
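For reference, winlogbeat talks to Humio through Humio's Elasticsearch-compatible ingest endpoint, so the output section should look roughly like this (the password is your ingest token; the username can be anything):

output.elasticsearch:
  hosts: ["https://cloud.humio.com:443/api/v1/ingest/elastic-bulk"]
  username: "winlogbeat"
  password: "YOUR-INGEST-TOKEN"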


Vagrant command line guide:

Open command prompt and cd into the LogDetectionLab folder.
Type vagrant up to bring up all 3 virtual machines.
Your initial run will download the VM boxes and set everything up. This may take 30 minutes to an hour. 

Once all the machines are up and running and the vagrant command exits in the command prompt, you will need to log into the kali linux VM and change the eth1 IP to the address listed above.

You will have to disable Defender on workstation1 and install invoke-atomicredteam manually (check the github page for bugs).

For using invoke-atomicredteam, you will need to open powershell and run: Import-Module "C:\AtomicRedTeam\invoke-atomicredteam\Invoke-AtomicRedTeam.psd1" -Force
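Once the module is imported, you can generate log data by running techniques, e.g. Invoke-AtomicTest T1003 (add -ShowDetailsBrief first if you just want to list the tests for a technique).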

You can also do vagrant up MACHINENAME, such as vagrant up dc1.

To tear down the lab, you need to run vagrant destroy -f. This will shutdown the VMs and remove them.

Vagrant also supports making snapshots, and you can read more about that in the vagrant docs.

modifying the project

Vagrantfile - this can be changed to modify VM cpu and memory resources, how port forwarding works, hostname, ip address, and scripts that run.

install-dc.ps1 - domain controller promotion script

join-domain.ps1 - joins the computer to the domain and adds jsmith as a local admin

create-users.ps1 - creates users on the dc

create-smbshare.ps1 - create an smb share on the dc

change_ui.ps1 - changes some Windows settings so the UI is adjusted for best performance

change_sec_config.bat - disable updates, disable firewall, disable defender, disable uac, and enable rdp

install-atomicredteam.ps1 - installs invoke-atomicredteam

enable_logging.bat - enables a bunch of logging stuff, installs sysmon with olafhartong config, and downloads winlogbeat

winlogbeat.yml - winlogbeat config file. You'll have to edit this to change where the logs go. Also, as you start seeing event IDs that are not useful, you can edit this to remove them or modify enable_logging.bat to avoid enabling certain events.

setup_winlogbeat.bat - sets up winlogbeat


I kept getting errors after I promoted the domain controller and then tried to reboot. The errors were related to winrm. I added

  config.winrm.transport = :plaintext
  config.winrm.basic_auth_only = true

to the Vagrantfile and executed "reg add HKEY_LOCAL_MACHINE\\Software\\Microsoft\\Windows\\CurrentVersion\\Policies\\System /v EnableLUA /d 0 /t REG_DWORD /f /reg:64" before promoting, and that seemed to fix this issue.

At the time of posting this blog post, I'm having an issue with workstation1 not installing atomicredteam correctly. AV doesn't get turned off for some reason.

I can't change the IP address on kali through vagrant.

me typing vagrant destory -f for 10 minutes trying to figure out why it didn't work was also challenging.