Devs Fixed

Resolved: Adding a new column to a dataframe with a value which is based on the values from next rows

By Isaac Tonny on 17/06/2022 Issue

Question:

I have a dataframe as shown below.
I need to add a new column that is populated only for rows where fieldmname is “jobstage”. The value should be the latest status for that jobstage, found by looking at the following rows; only rows whose coltype is “status” should be considered when selecting it.
Expected dataframe:
I tried lead, lag, and row_number, but did not get the expected result.

Answer:

The question is tagged pyspark, so here is one way to do this in PySpark using the first() window function.
For each “jobstage” row, it takes the first record among the corresponding rows where fieldmname is “jobstatus” and coltype is “status”.


apache-spark pyspark scala