Estimating how consumers perceive a product’s appearance is essential. However, this is difficult because consumers’ tastes vary and because consumers and designers experience design differently. Multimodal foundation models trained on datasets collected from the internet could be applied to this estimation; however, it remains unclear whether the models’ tastes resemble those of consumers or those of experts such as designers. We therefore conducted surveys in which consumers and designers rated the appearance of car wheels, and we had a foundation model estimate the visual impression of the wheels. The model’s ratings were more similar to those of designers than to those of consumers. One possible explanation is that the training datasets contain advertisements and reviews written by experts or product owners who hold opinions on product appearance, so the models may have acquired tastes closer to those of experts.