Write an executor service to process records in batches of 100

import com.amazonaws.services.s3.model.ListObjectsV2Request;
import com.amazonaws.services.s3.model.ListObjectsV2Result;
import com.amazonaws.services.s3.model.S3ObjectSummary;

import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

import static java.math.RoundingMode.CEILING;

    private void startProcess() throws Exception {
        ListObjectsV2Request listObjectsV2Request = new ListObjectsV2Request();
        listObjectsV2Request.setBucketName(bucket);
        listObjectsV2Request.setPrefix(prefix);
        ListObjectsV2Result listObjectsV2Result;

        // Each listObjectsV2 call returns at most 1000 object summaries; keep
        // paging with the continuation token until the listing is exhausted.
        do {
            listObjectsV2Result = s3Client.listObjectsV2(listObjectsV2Request);
            execute(listObjectsV2Result.getObjectSummaries());
            listObjectsV2Request.setContinuationToken(listObjectsV2Result.getNextContinuationToken());
        } while (listObjectsV2Result.isTruncated());
    }

    public void execute(List<S3ObjectSummary> s3ObjectSummaries) throws Exception {
        // Split the current page (up to 1000 summaries) into batches of 100.
        List<List<S3ObjectSummary>> results = partition(s3ObjectSummaries, 100);

        // Submit one task per batch, then block until every batch has
        // finished before moving on to the next page.
        List<Future<?>> futureResults = new ArrayList<>();
        for (List<S3ObjectSummary> summaries : results) {
            futureResults.add(executor.submit(() -> {
                process(summaries); // per-batch worker, defined elsewhere
                return true;
            }));
        }
        for (Future<?> asyncResult : futureResults) {
            asyncResult.get();
        }
    }

    private List<List<S3ObjectSummary>> partition(List<S3ObjectSummary> list, Integer partitionSize) {
        // Ceiling division: the number of batches needed to cover the list.
        int numberOfLists = BigDecimal.valueOf(list.size())
                .divide(BigDecimal.valueOf(partitionSize), 0, CEILING)
                .intValue();

        return IntStream.range(0, numberOfLists)
                .mapToObj(it -> list.subList(it * partitionSize, Math.min((it + 1) * partitionSize, list.size())))
                .collect(Collectors.toList());
    }
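
The snippet references a few fields (s3Client, executor, bucket, prefix) and a process(...) worker that are not shown, presumably defined on the enclosing class. A minimal sketch of what they might look like, with purely illustrative names and values:

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

    private final AmazonS3 s3Client = AmazonS3ClientBuilder.defaultClient();
    private final ExecutorService executor = Executors.newFixedThreadPool(10); // one pool, 10 worker threads
    private final String bucket = "my-bucket";   // placeholder
    private final String prefix = "my/prefix/";  // placeholder

With a fixed pool of 10 threads, the 10 batches produced from each 1000-record page can all run concurrently.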

The above code fetches up to 1000 records per continuation token. I want to process each page of 1000 records by splitting it into batches of 100, with every batch of 100 submitted as a task to the executor service, so each page becomes 10 tasks on the pool, each processing 100 records asynchronously. Is there an example that fits this scenario? Should I instead iterate over the 1000 records and collect them into a map first, submitting the map to the executor service at every 100th record and flushing it for the next iteration? Is that the right approach? Any suggestions?
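
For illustration only, here is a minimal sketch of that accumulate-and-flush idea, under two assumptions not in the original post: a plain List is used as the buffer instead of a map (only arrival order matters here), and the executor field and process(List<S3ObjectSummary>) worker are the same ones referenced above:

    // Sketch: buffer records as they stream past and submit every full batch
    // of 100 to the pool; names other than executor/process are made up.
    public void executeStreaming(List<S3ObjectSummary> s3ObjectSummaries) throws Exception {
        List<Future<?>> futureResults = new ArrayList<>();
        List<S3ObjectSummary> buffer = new ArrayList<>(100);

        for (S3ObjectSummary summary : s3ObjectSummaries) {
            buffer.add(summary);
            if (buffer.size() == 100) {
                // Copy before submitting: the task runs asynchronously, so it
                // must not share the buffer that is about to be cleared.
                List<S3ObjectSummary> batch = new ArrayList<>(buffer);
                futureResults.add(executor.submit(() -> {
                    process(batch);
                    return true;
                }));
                buffer.clear();
            }
        }
        // Flush the final partial batch, if any.
        if (!buffer.isEmpty()) {
            List<S3ObjectSummary> batch = new ArrayList<>(buffer);
            futureResults.add(executor.submit(() -> {
                process(batch);
                return true;
            }));
        }
        for (Future<?> asyncResult : futureResults) {
            asyncResult.get(); // propagate any task failure
        }
    }

Functionally this lands in the same place as the partition(...) version already posted: each 1000-record page becomes ten 100-record tasks. The buffering pass would only be needed if the records arrived as a stream rather than a ready-made list.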
