Introduction
When working with JSON data in jq, you often need to filter out values based on a condition. In this blog post, we look at one such case: excluding entries from a JSON object when they contain a certain substring. Let’s dive into the solution!
The Problem
Suppose you have a JSON object and you want to filter out values that contain a specific substring. In this case, you want to keep only the images whose imageTags do not contain the string “latest”. The initial query did the opposite: it selected the images whose tags contain “latest”, which is not what you wanted. Here’s the original query:
code
aws ecr describe-images --repository-name <repo> --output json | jq '.[]' | jq '.[]' | jq "select ((.imagePushedAt < 14893094695) and (.imageTags[] | contains(\"latest\")))"
The Solution
To achieve the desired result and filter out values that contain the “latest” substring, you can modify the query as follows:
code
aws ecr describe-images --repository-name <repo> --output json | jq '.[]' | jq '.[]' | jq "select (all(.imageTags[]; contains(\"latest\") | not))"
In this modified query, contains() checks whether a tag contains the substring “latest”, and the not filter reverses the result. Wrapping the test in all() evaluates it once per image, so an image is selected only if none of its tags contain “latest”. Applying select() directly to .imageTags[] | contains("latest") | not would instead emit the image once for every tag that passes the test, so an image tagged both “v1” and “latest” would still appear in the output.
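To see how the per-tag and per-image forms differ, here is a minimal sketch run against a small hand-written sample. The sample JSON, the digests, and the variable names are assumptions made up for illustration; they only mimic the general shape of the describe-images output.

```shell
# Hypothetical sample shaped like `aws ecr describe-images` output (not real data).
sample='{"imageDetails":[
  {"imageDigest":"sha256:aaa","imageTags":["v1","latest"]},
  {"imageDigest":"sha256:bbb","imageTags":["v2","stable"]}
]}'

# Per-tag test: select() runs once for each tag, so an image is emitted
# once for every tag that does not contain "latest".
per_tag=$(printf '%s' "$sample" |
  jq -r '.[][] | select(.imageTags[] | contains("latest") | not) | .imageDigest')

# Per-image test: all() requires every tag to pass, so each image is
# emitted at most once, and only if no tag contains "latest".
per_image=$(printf '%s' "$sample" |
  jq -r '.[][] | select(all(.imageTags[]; contains("latest") | not)) | .imageDigest')

echo "per tag:"
echo "$per_tag"
echo "per image:"
echo "$per_image"
```

With this sample, the per-tag form lists sha256:aaa once (because of its “v1” tag) and sha256:bbb twice, while the per-image form lists only sha256:bbb.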
Using Index Function
Instead of contains(), you can use jq's index() function, which returns the position of a match, or null when there is none. Here’s how you can modify the query:
code
aws ecr describe-images --repository-name <repo> --output json | jq '.[]' | jq '.[]' | jq "select (.imageTags | index(\"latest\") | not)"
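In this approach, index() is applied to the imageTags array, so it returns the position of the first element that is equal to “latest”, or null if there is no such element. The not filter turns null into true, so only images without a “latest” tag are selected. Note that this is an exact match on whole tags: unlike contains(), it does not exclude an image whose tag merely includes the substring, such as “v1-latest”.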
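The difference between array input and string input is easy to check with a few one-liners. This is a small sketch using made-up tag values; the variable names are only for illustration.

```shell
# Hypothetical tag values, for illustration only.

# On an array, index() looks for an element equal to its argument.
exact=$(jq -nc '["v1","latest"] | index("latest")')
partial=$(jq -nc '["v1-latest"] | index("latest")')

# On a string, index() looks for a substring.
substring=$(jq -nc '"v1-latest" | index("latest")')

echo "array, exact tag:     $exact"
echo "array, partial tag:   $partial"
echo "string, substring at: $substring"
```

The array lookups print 1 and null, and the string lookup prints 3. If you need substring matching with index(), it has to be applied to each tag string rather than to the array as a whole.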
Simplifying the Pipeline
You can simplify the pipeline by combining the three jq invocations into a single call. Here’s an example:
code
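aws ecr describe-images --repository-name <repo> --output json | jq '.[][] | select (.imageTags | contains(["latest"]) | not)'
In this simplified pipeline, .[][] iterates over the values of the top-level object and then over the elements of each array, replacing the two separate jq '.[]' calls. Because the filter now receives the whole imageTags array, the argument to contains() must be an array too: contains(["latest"]) is true when any tag contains the substring “latest” (passing a plain string here would raise an error). The not filter then inverts the result, filtering out the images that have such a tag.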
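The single-call form can be tried without AWS access by feeding jq a hand-written sample. The JSON below is an assumption that only mimics the general shape of the describe-images output, and it uses the array form contains(["latest"]), since contains() on an array expects an array argument.

```shell
# Hypothetical sample shaped like `aws ecr describe-images` output (not real data).
sample='{"imageDetails":[
  {"imageDigest":"sha256:aaa","imageTags":["v1","latest"]},
  {"imageDigest":"sha256:bbb","imageTags":["v2","stable"]},
  {"imageDigest":"sha256:ccc","imageTags":["v3-latest"]}
]}'

# .[][] walks into the object and then into the array of images;
# select() keeps an image only if no tag contains the substring "latest".
kept=$(printf '%s' "$sample" |
  jq -r '.[][] | select(.imageTags | contains(["latest"]) | not) | .imageDigest')

echo "$kept"
```

Only sha256:bbb is printed here: the first image has an exact “latest” tag and the third has a tag that contains it as a substring.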
Using Any or All Functions
If each value you want to test is itself an array of strings, as imageTags is, you can use the any() or all() functions in jq to test the whole array at once. Here’s an example:
code
echo '[["foobar","qux"],["barbaz"]]' | jq -c 'map(select(any(contains("foo")) | not))'
In this case, any(contains("foo")) is true if at least one string in the inner array contains the substring “foo”. Negating it with not makes select() keep only the inner arrays in which no string contains “foo”, so the result is [["barbaz"]]. For a flat array of strings such as ["foobar","barbaz"], any() is not needed (and fails, because a string cannot be iterated): map(select(contains("foo") | not)) is enough.
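As a quick check of how any() and all() relate, the sketch below runs both on the same made-up input, where each inner array stands in for one item's list of strings. The input values and variable names are assumptions for illustration.

```shell
# Hypothetical input: each inner array plays the role of one item's tag list.
input='[["foobar","qux"],["barbaz"],["baz","food"]]'

# Keep an inner array only if it is NOT the case that some string contains "foo".
with_any=$(printf '%s' "$input" | jq -c 'map(select(any(contains("foo")) | not))')

# Equivalent form: keep it only if every string does not contain "foo".
with_all=$(printf '%s' "$input" | jq -c 'map(select(all(contains("foo") | not)))')

# Flat array of strings: test each string directly, no any()/all() required.
flat=$(printf '%s' '["foobar","barbaz"]' | jq -c 'map(select(contains("foo") | not))')

echo "$with_any"
echo "$with_all"
echo "$flat"
```

Both nested forms print [["barbaz"]], and the flat form prints ["barbaz"]. Choosing between any(...) | not and all(... | not) is mostly a matter of which reads more naturally for the condition at hand.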
Conclusion
By exploring these alternative approaches, you have more options for filtering JSON in jq to exclude specific values. Whether you use the index() function, simplify the pipeline, or leverage the any() or all() functions, you can achieve the desired outcome and customize the filtering process according to your needs.
Remember to consider the structure of your JSON data and adapt the queries accordingly. With these techniques, you can filter JSON in jq to exclude specific values and get the results you need.