top of page

Implementing Sweep Algorithm for Constrained Clustering in R

Writer's picture: Shibaprasad BhattacharyaShibaprasad Bhattacharya

Constrained-based clustering is a unique way of clustering where not only we take into account the similarity of the points but also the specific constraints which they might have or we manually impose.


Sweep algorithm is a popular algorithm used for constrained clustering and vehicle routing problems. The basic idea here is that we will 'sweep' the destinations with their similarity and cluster them together. The basic objective here is something like that:



Here we can see the depot will serve each cluster. Now, to implement it in real life, I have made a modification. Not only each point will be under a cluster like the above image but, each cluster will have a specific capacity that it will not exceed. This may become useful if someone is planning to use a single vehicle for a single cluster.


First, I have ordered the points in a clockwise manner with respect to the center. After that, the cluster has been formed with respect to their demands. The R code for implementing it is as below. Let's first create a dummy data frame and load the tidyverse package.



library(tidyverse)

set.seed(123)
id<- seq(1:600)
lon <- rnorm(600, 88.5, 0.125)
lat <- rnorm(600, 22.4, 0.15)
demand <- round(rnorm(600, 40, 20))

df<- data.frame(id, lon, lat, demand)

Now let's create a cluster function directly. It will do both the sorting and clustering:

constrained_cluster <- function(df,truck_load=170, lat_median=median(df$lat), lon_median=median(df$lon)){
  df$angle = atan2(df$lat - median(df$lat),
                   df$lon - median(df$lon))
  df<- df[order(df$angle, decreasing = TRUE),]
  d<-0
  cluster_number<-1
  cluster_list<- c()
  i<-1
  while (i <= length(df$angle)){
    d <- d+ df$demand[i]
    if(d<=truck_load){
      cluster_list[i] <- cluster_number
      i<- i+1
    }
    else{
      cluster_number <- cluster_number+1
      d <- 0
      i<-i
    }
  }
  return(cbind(df,as.data.frame(cluster_list)))
}

Next step, let's call the function and visualize the clusters. Let's see if they resemble the initial picture:

df_with_cluster<- constrained_cluster(df)
ggplot(data=df_with_cluster) +
  geom_point(aes(x=lon, y=lat, colour=factor(cluster_list)))+
  theme(legend.position = 'none')

Voila! It is done!

373 views0 comments

Comments


bottom of page