Modeling relationships between retail prices and consumer reviews: A machine discovery approach and comprehensive evaluations

https://doi.org/10.1016/j.dss.2021.113536

Highlights

  • A novel data-driven Generate/Test Cycle was designed to automatically discover feasible models.

  • A Monte Carlo simulation was performed to validate the designed approach.

  • Models were built to describe relationships between retail prices and reviews for one product at the individual level.

  • A guided map was offered by using the comprehensive evaluations of the candidate models.

Abstract

Setting the retail price as a part of marketing would affect customers' cognition regarding products and affect their post-purchase behavior of review writing. To deeply understand the relationships between retail prices and reviews, this paper designs an intelligent data-driven Generate/Test Cycle using a machine learning technique to automatically discover the relationship model from a huge amount of data without a prior hypothesis. From a unique dataset, various free-form relationship models with their own structures and parameters have been discovered. By the comprehensive evaluations of candidate models, a guided map was offered to understand the relationship between dynamic retail prices and the volume/valence of reviews for different types of products. Experimental results show that 37.69% of products in our sample exhibit the following trend: When the price is increased to a certain level, the volume of reviews shifts from a decreasing trend to an increasing trend. Results also demonstrate that a linearly increasing relationship model between prices and the valence of reviews is more suitable for the low-involvement products than for the high-involvement products. In addition to the new findings, this research provides a powerful tool to assist domain experts in building relationship models for decision making in a highly efficient manner.

Introduction

Word-of-mouth (WOM) plays a significant role in influencing consumers' attitudes and purchase decisions [1,2]. The popularity of online retailer websites enables consumers to post reviews on products and services, which act as a good proxy for electronic WOM with a high degree of credibility [3]. A survey conducted by Dimensional showed that an overwhelming 90% of respondents, who recall reading online reviews, claim that positive online reviews influence their buying decisions [4]. A report from Harvard Business Review found that a one-star increase in Yelp rating led to a 5%–9% increase in revenue [5]. These data suggest that consumer reviews have a significant impact on others' purchase decisions and retail companies' revenues.

As a marketing tool, price is an important decision variable for a product and can affect customers' cognition, feelings [6], purchase decisions, and post-purchase satisfaction [7]. Previous research has also found that price can affect consumer reviews [8,9]. Online retailers can adjust their prices more frequently and easily than physical retail stores. One survey estimated that Amazon changes retail prices more than 2.5 million times daily across its millions of products.

In this vein, a fundamentally important question to ask is as follows: What effects can be observed regarding volume and valence of consumer reviews after increasing or decreasing the retail price for a specific product? In this study, volume measures the total amount of reviews posted on a product and is an important cue for product popularity [10]. Valence (i.e., star rating) captures the positive or negative nature of reviews, which contains evaluation information on product quality. To answer this question, computable models for describing relationships between prices and volume/valence of reviews should be built.
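The two review measures defined above can be illustrated with a short sketch. It assumes each review is already paired with the price at which the product was bought; the `reviews` tuples and the function name `volume_and_valence` are hypothetical and are not taken from the paper's dataset or code.

```python
from collections import defaultdict

# Hypothetical (price_at_purchase, star_rating) pairs for one product.
reviews = [(99.0, 5), (99.0, 4), (89.0, 3), (89.0, 5), (89.0, 4), (109.0, 2)]

def volume_and_valence(reviews):
    """Return {price: (volume, valence)} for one product.

    volume  = number of reviews posted at that price
    valence = mean star rating of those reviews
    """
    stars_by_price = defaultdict(list)
    for price, stars in reviews:
        stars_by_price[price].append(stars)
    return {
        price: (len(stars), sum(stars) / len(stars))
        for price, stars in sorted(stars_by_price.items())
    }

summary = volume_and_valence(reviews)
for price, (volume, valence) in summary.items():
    print(price, volume, valence)
```

Each price level then yields one (price, volume) point and one (price, valence) point, which is the kind of input a relationship model between prices and reviews would be fitted to.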

Previous researchers have found that price has a direct or indirect effect on consumer reviews because of intermediate factors, such as sales, satisfaction [11], loyalty [12,13], and biased acquisition [14]. However, these intermediate factors exist concurrently, making them difficult to disentangle. An open problem remains: the relationships between retail prices and consumer reviews are unclear. A handful of studies have attempted to understand the relationship between market price and consumer reviews. Chen et al. (2011) [8], using automobile-model data, found a U-shaped relationship between market price and volume of reviews and no significant relationship between market price and valence of reviews at mature stages of Internet use. Li and Hitt (2010) [9], using digital camera data, found that market average price has a significant negative effect on the valence of reviews.

These studies were performed at the macro (market) level by employing market price, in other words, a constant average of market prices for all products at an aggregate level. Moreover, in that research, consumer reviews and prices were collected from different websites, and the price was not the real transactional price. As a result, their findings might not help in understanding the nuances of relationships between product prices and consumer reviews. Such information sources are too coarse to reveal the relationships between prices and reviews because each review is not associated with the price at which the consumer bought the product. Because the transaction price corresponding to a review is difficult to obtain, searching for these relationships is difficult.

Fortunately, we obtained a unique data set from an online retailer that comprised 321 types of products with retail prices and corresponding reviews. Prices changed 5431 times during the period of data collection, and 1,738,114 reviews were crawled in the same period. For model building, the traditional paradigm often depends on a Generate/Test Cycle [15,16]. Such cycles begin with observations of the data, and then hypotheses are generated and tested against the data. Eventually, promising models are produced. This paper designs a new data-driven Generate/Test Cycle to automatically discover relationship models by using machine learning techniques. Unlike the traditional paradigm, in which domain experts generate alternatives and test them against constraints, the proposed approach develops a mechanism to automatically learn the structures and parameters of models from data, without prior hypothetical forms provided by domain experts. The key is an intelligent model discovery method based on genetic programming (GP) [17], which has been applied successfully in various fields to discover functions, for example, relationship functions [18,19] and ranking functions [20,21].
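To make the idea of an automated Generate/Test Cycle concrete, the following is a minimal sketch and not the authors' method. It generates random expression trees over a price variable, tests them by mean squared error, and keeps and mutates the better half. It is a simplified mutation-only evolutionary search: full GP would normally also use crossover. The function set `OPS`, the tree encoding, and all parameter values are assumptions made for illustration, and the data are synthetic.

```python
import random

# Assumed function set; the paper's actual primitives are not given here.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}

def random_tree(rng, depth=3):
    """Generate step: build a random expression tree over the variable 'x'."""
    if depth <= 0 or rng.random() < 0.3:
        # Leaf: either the price variable or a random constant.
        return "x" if rng.random() < 0.5 else round(rng.uniform(-5, 5), 2)
    op = rng.choice(sorted(OPS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, x):
    """Evaluate an expression tree at price x."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree  # variable or numeric constant

def mse(tree, data):
    """Test step: mean squared error of the candidate on (x, y) pairs."""
    return sum((evaluate(tree, x) - y) ** 2 for x, y in data) / len(data)

def mutate(rng, tree, depth=3):
    """Replace a randomly chosen subtree with a freshly generated one."""
    if not isinstance(tree, tuple) or rng.random() < 0.3:
        return random_tree(rng, depth)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(rng, left, depth - 1), right)
    return (op, left, mutate(rng, right, depth - 1))

def generate_test_cycle(data, generations=200, pop_size=50, seed=0):
    """Alternate generating candidates and testing them against the data."""
    rng = random.Random(seed)
    population = [random_tree(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda t: mse(t, data))
        survivors = population[: pop_size // 2]  # the best candidates are kept
        population = survivors + [mutate(rng, rng.choice(survivors)) for _ in survivors]
    return min(population, key=lambda t: mse(t, data))

# Synthetic U-shaped data, used only to exercise the loop.
data = [(float(x), (x - 5.0) ** 2 + 3.0) for x in range(1, 10)]
best = generate_test_cycle(data)
print(best, mse(best, data))
```

Because the surviving half is carried over unchanged, the best error never gets worse from one generation to the next. Both the structure of the expression and its constants come out of the search rather than from a form specified in advance.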

Thus, in this paper, the GP method is introduced as an intelligent tool to exploit functional relationships between retail prices and consumer reviews from a large and unique data set. Experimental results show that for the relationships between retail prices and volume of reviews, three types of models demonstrate the best performance: the linearly decreasing, asymmetric U-shaped, and asymmetric inverted U-shaped models. For the relationships between retail prices and valence of reviews, the promising models are the linearly decreasing, asymmetric inverted U-shaped, and linearly increasing models.

Nevertheless, none of the models dominates all the others on the basis of three evaluation metrics: fitness, complexity, and coverage. For example, for the relationships between retail prices and volume of reviews, the linearly decreasing models feature high coverage, low complexity, and low fitness, whereas the asymmetric U-shaped model features low coverage, high complexity, and high fitness. Instead of simply suggesting one model, comprehensive evaluations have been conducted to examine the performance of each candidate model in various categories of products to show its comparative advantages and disadvantages. The experimental results provide detailed references for the application of relationship models, such as which model is more suitable for a product or how to choose another model to complement a given model when it does not capture the relationship well under a certain metric.
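The statement that no model dominates the others can be read as a Pareto-style comparison across the three metrics. The sketch below is one possible way to express that check, not the paper's evaluation procedure. It assumes that higher fitness and coverage are better and that lower complexity is better; the scores and the third candidate are invented purely for illustration.

```python
from typing import NamedTuple

class ModelScore(NamedTuple):
    name: str
    fitness: float     # assumed: higher is better
    complexity: float  # assumed: lower is better
    coverage: float    # assumed: higher is better

def dominates(a, b):
    """True if a is at least as good as b on every metric and better on one."""
    no_worse = (a.fitness >= b.fitness
                and a.complexity <= b.complexity
                and a.coverage >= b.coverage)
    strictly_better = (a.fitness > b.fitness
                       or a.complexity < b.complexity
                       or a.coverage > b.coverage)
    return no_worse and strictly_better

def pareto_front(models):
    """Keep the models that no other model dominates."""
    return [m for m in models
            if not any(dominates(other, m) for other in models if other is not m)]

# Illustrative scores only; these numbers are not results from the paper.
candidates = [
    ModelScore("linear_decreasing", fitness=0.4, complexity=3, coverage=0.6),
    ModelScore("asymmetric_u", fitness=0.8, complexity=9, coverage=0.3),
    ModelScore("hypothetical_weak", fitness=0.3, complexity=10, coverage=0.2),
]

for model in pareto_front(candidates):
    print(model.name)
```

With these made-up scores, the linearly decreasing and asymmetric U-shaped candidates both remain on the front because each wins on a different metric, which mirrors the trade-off described above and is why several candidate models are evaluated side by side.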

The research in this paper makes the following contributions.

  • A novel data-driven Generate/Test Cycle using a machine learning technique, namely, GP, was designed to automatically discover feasible models to express the relationships between prices and reviews. This type of research demonstrates an alternative modeling method in information systems research that could greatly improve modeling efficiency.

  • Research on the impact of vendor marketing strategies on user-generated content (UGC) was extended. More specifically, based on the unique datasets, our research is the first to model the relationships between retail prices and consumer reviews for one product at the individual level. It could be more useful for marketers to perform marketing activities precisely and cost-effectively regarding the product.

  • Empirical findings and a guided map were offered to understand the relationship between dynamic retail prices and the volume/valence of consumer reviews by using the comprehensive evaluations of the candidate models. This method advances empirical research in the information systems field.

The remainder of the paper is organized as follows. In Section 2, we review related research. In Section 3, we first introduce the framework and then illustrate the method. Furthermore, we use a Monte Carlo simulation to validate the proposed method. In Sections 4 and 5, we describe the data and elaborate on the results. Finally, we discuss the findings in Section 6 and conclude our paper in Section 7.

Section snippets

Relationship between price and consumer reviews

Previous research has found that price affects volume and valence of consumer reviews through various mediating factors, such as sales, satisfaction [11], loyalty [12,13], and biased acquisition [14]. First, consumer reviews can be posted on an e-commerce website only after purchase. More sales could possibly lead to more reviews and vice versa. Price affects sales and is ultimately reflected in volume of consumer reviews. In addition, when facing quality uncertainty, consumers are likely to

Methodology

In this section, we present the methodology. We first describe the framework and then introduce the model searching and model selection. To validate the proposed method, a Monte Carlo simulation is performed to examine whether it could accurately discover the models.
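The logic of a Monte Carlo validation can be sketched as follows: simulate data from a known model, run the discovery procedure, and count how often the known structure is recovered. This is only an illustration under stated assumptions. As a stand-in for the GP search, it chooses between a linear and a quadratic polynomial by least squares and BIC; the true coefficients, noise level, sample size, and all function names are hypothetical.

```python
import math
import random

def fit_polynomial(xs, ys, degree):
    """Least-squares fit via normal equations and Gauss-Jordan elimination.

    Returns coefficients in increasing order of power.
    """
    k = degree + 1
    m = [
        [sum(x ** (i + j) for x in xs) for j in range(k)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(k):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [m[i][k] / m[i][i] for i in range(k)]

def predict(coeffs, x):
    return sum(c * x ** p for p, c in enumerate(coeffs))

def bic(xs, ys, coeffs):
    """Bayesian information criterion; penalizes extra parameters."""
    n = len(xs)
    rss = sum((y - predict(coeffs, x)) ** 2 for x, y in zip(xs, ys))
    return n * math.log(rss / n) + len(coeffs) * math.log(n)

def discover_degree(xs, ys, candidates=(1, 2)):
    """Stand-in discovery step: pick the candidate structure with lowest BIC."""
    return min(candidates, key=lambda d: bic(xs, ys, fit_polynomial(xs, ys, d)))

def recovery_rate(true_coeffs, trials=200, n=30, noise_sd=1.0, seed=0):
    """Share of simulated datasets on which the true structure is recovered."""
    rng = random.Random(seed)
    true_degree = len(true_coeffs) - 1
    hits = 0
    for _ in range(trials):
        xs = [rng.uniform(0, 10) for _ in range(n)]
        ys = [predict(true_coeffs, x) + rng.gauss(0, noise_sd) for x in xs]
        hits += discover_degree(xs, ys) == true_degree
    return hits / trials

# Ground truth 0.5 * (x - 5)^2, a symmetric U shape chosen for illustration.
print(recovery_rate([12.5, -5.0, 0.5]))
```

A recovery rate near one on simulated data gives some confidence that the procedure can find a structure when one exists. The same loop can be repeated with other ground-truth forms and noise levels.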

Data

The dataset is from Jingdong (JD.com), one of the largest B2C online retailers in China. JD.com had 236.5 million active customer accounts and US$26.7 billion in GMV for the first quarter of 2017. Similar to Amazon.com, JD.com provides a platform where consumers can write post-purchase reviews. If a review is approved as valid by JD.com according to its criteria described on the website, the user can obtain 10

Results and findings

In this section, we describe the found models and analyze the relationships between the prices and volume/valence of reviews described by the models.

Product category

The aforementioned experiments were conducted on the whole dataset for all the products. Different products play different roles in consumers' purchase decisions. The literature has divided products into two categories: high-involvement products and low-involvement products. Product involvement level refers to a consumer's perceived importance of or interest in a product [36]. High-involvement products are those that are more important and interesting to consumers. Typical examples of

Conclusions and implications

Much recent research has indicated that consumer reviews are critical to product management and how product managers engage in proactive marketing efforts, such as modifying the retail price, to affect the online review behavior of consumers after shopping. Thus, the model is necessary and helpful for depicting the relationships between reviews and prices. Considering this information, we designed an intelligent data-driven Generate/Test Cycle by introducing a machine learning approach without

Acknowledgement

This work was supported by the National Natural Science Foundation of China (42071273, 71671024, 71871041, 71421001), Research Funds of Education Department of Liaoning Province (LN2020Q30), Research Funds of Dongbei University of Finance and Economics (DUFE2020Q08), Fundamental Research Funds for the Central Universities (DUT20JC38), Liaoning Revitalization Talents Program (XLYC1807143), and Social Planning Foundation of Liaoning (L17AGL012).

Xian Yang is an assistant professor at School of Management Science and Engineering in Dongbei University of Finance and Economics. Her research focuses on online reviews, computational intelligence and business data analysis.

References (44)

  • M. Olmedilla et al. The superhit effect and long tail phenomenon in the context of electronic word of mouth. Decis. Support. Syst. (2019)

  • X. Lu et al. Promotional marketing or word-of-mouth? Evidence from online restaurant reviews. Inf. Syst. Res. (2013)

  • A. Gesenhues. Survey: 90% of Customers Say Buying Decisions Are Influenced by Online Reviews. (2013)

  • M. Luca. Reviews, reputation, and revenue: The case of Yelp.com, Harvard Business Review. (2011)

  • V.R. Rao. Pricing research in marketing: the state of the art. J. Bus. (1984)

  • M. Puccinelli et al. Customer experience management in retailing: understanding the buying process. J. Retail. (2009)

  • X. Li et al. Price effects in online product reviews: an analytical model and empirical analysis. MIS Q. (2010)

  • N. Archak et al. Deriving the pricing power of product features by mining consumer reviews. Manag. Sci. (2011)

  • E. Anderson. Customer satisfaction and word of mouth. J. Serv. Res. (1998)

  • L. Krishnamurthi et al. An empirical analysis of the relationship between loyalty and consumer price elasticity. Mark. Sci. (1991)

  • D. Bowman et al. Managing customer-initiated contacts with manufactures: the impact on share of category requirements and word-of-mouth behavior. J. Mark. Res. (2001)

  • N. Hu et al. On self-selection biases in online product reviews. MIS Q. (2017)
    Guangfei Yang is a professor at Institute of Systems Engineering in Dalian University of Technology. He received his doctoral degree in engineering at Waseda University in 2009. His research is about data mining and computational intelligence.

    Jiangning Wu is a professor at Institute of Systems Engineering in Dalian University of Technology. She received her PhD at The University of Hong Kong. Her research is about data mining and business intelligence.

    Yanzhong Dang is a professor at Institute of Systems Engineering in Dalian University of Technology. His research is about data mining, knowledge management and business intelligence.

    Weiguo Fan is Henry B. Tippie Chair Professor in Business Analytics at the University of Iowa. He received his PhD in Business Administration from the Ross School of Business, University of Michigan, Ann Arbor, in 2002. His research interests focus on the design and development of novel information technologies — information retrieval, data mining, text analytics, social media analytics, business intelligence techniques — to support better business information management and decision making.
